
Chapter 4 - 6

compiler

Uploaded by

henamelese05

Compiler Design (SENG-4092)
Lecture 4: Semantic Analysis

Faculty of Computing & Software Engineering

October 8, 2024 Compiled by: Nathnael AMU, Arba Minch


Objective
 Introduction To Semantic Analysis
 Semantic Errors
 Syntax-directed Translation
 Attributes
 Dependency Graph
 Evaluation Order

A Step-Back

CHAPTER TWO
•Strings
•Regular expressions
•Tokens
•Transition diagrams
•Finite Automata

A Step-Back

CHAPTER THREE
•Grammars
•Derivations
•Parse-trees
•Top-down parsing (LL)
•Bottom-up parsing (LR, SLR, LALR)

We Need Some Tools

 To help in semantic analysis
 To help in intermediate code generation
 Two such tools
   Semantic rules (Syntax-Directed Definitions)
   Semantic actions (Syntax-Directed Translations)

Semantic Analysis

 Semantic analysis checks the source program for semantic errors.
 It uses the hierarchical structure determined by the syntax analysis phase to identify the operators and operands of expressions and statements.
 Semantic analysis performs type checking: it checks whether each operator has operands that are permitted by the source language specification.
Example: If a real number is used to index an array, e.g. a[1.5], the compiler reports an error. This error is detected during semantic analysis.

What does a semantic analyzer do?
 Semantic analysis judges whether the syntactic structure built from the source program has a valid meaning.
 For example
   int a = “value”;
   should not raise an error in the lexical or syntax analysis phase, as it is lexically and structurally correct, but it should generate a semantic error because the type of the assigned value differs from the declared type of the variable.
 The following tasks should be performed in semantic analysis:
   Scope resolution
   Type checking
   Array-bound checking

Semantic Errors

 These are some of the semantic errors that the semantic analyzer is expected to recognize:
   Type mismatch
   Undeclared variable
   Reserved identifier misuse
   Multiple declarations of a variable in a scope
   Accessing an out-of-scope variable
   Actual and formal parameter mismatch

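Several of the errors listed above can be detected with a stack of symbol tables, one per scope. The following is only a minimal sketch in Python: the statement encoding, the function name check and the error messages are assumptions made for illustration, not part of the course material.

```python
# Hypothetical mini-checker: the tuple encoding and messages are illustrative only.
def check(statements):
    """Return a list of semantic error messages.

    Each statement is a tuple:
      ("enter",) / ("exit",)    open or close a scope
      ("decl", type, name)      declare a variable in the current scope
      ("assign", name, type)    assign a value of the given type to name
    """
    scopes = [{}]            # stack of symbol tables, innermost scope last
    errors = []
    for stmt in statements:
        kind = stmt[0]
        if kind == "enter":
            scopes.append({})
        elif kind == "exit":
            scopes.pop()
        elif kind == "decl":
            _, typ, name = stmt
            if name in scopes[-1]:
                errors.append(f"multiple declaration of {name}")
            else:
                scopes[-1][name] = typ
        elif kind == "assign":
            _, name, value_type = stmt
            # Scope resolution: search from the innermost scope outwards.
            declared = next((s[name] for s in reversed(scopes) if name in s), None)
            if declared is None:
                errors.append(f"undeclared variable {name}")
            elif declared != value_type:
                errors.append(f"type mismatch in assignment to {name}")
    return errors

program = [
    ("decl", "int", "a"),
    ("assign", "a", "string"),   # int a = "value";
    ("decl", "int", "a"),        # second declaration in the same scope
    ("enter",),
    ("decl", "int", "b"),
    ("exit",),
    ("assign", "b", "int"),      # b is out of scope here
]
errors = check(program)
```

An out-of-scope access is reported here as an undeclared variable, because the scope that declared b has already been popped.
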
Syntax Directed Translation
 Syntax Directed Translation
   Refers to a method of compiler implementation where the source language translation is completely driven by the parser.
   It is the translation of languages guided by context-free grammars.
 The parsing process and parse trees are used to direct semantic analysis and the intermediate code translation of the source program.
 This can be a separate phase of a compiler, or we can augment the CFG with information to control the semantic analysis and translation process. Such

Continued…
 Conceptually, with both syntax directed definitions and translation schemes we:
   Parse the input token stream
   Build the parse tree
   Traverse the tree to evaluate the semantic rules at the parse tree nodes.
Input string → parse tree → dependency graph → evaluation order for semantic rules

Syntax directed translation: a set of semantic actions performed for each production, enclosed in { }.

Syntax Directed Translation
 Describes the output to generate for each input construct.
 A CFG in which a program fragment (output action, semantic action or semantic rule) is associated with each production.
 An output action may be:
   Computation of values for variables belonging to the compiler
   Generation of intermediate code
   Printing of an error diagnostic
   Placement of some values in a table

Syntax Directed Definitions
 Syntax directed definition: a grammar in which each grammar symbol is given attributes and each production is given semantic rules.
 Syntax directed definitions are a generalization of context-free grammars in which:
   Grammar symbols have an associated set of attributes;
   Productions are associated with semantic rules for computing the values of attributes.
 Such a formalism generates annotated parse trees, where each node of the tree is a record with a field for each attribute.

Continued…
 Syntax directed translation (SDT)
 Grammar + semantic rules = SDT
Informal notation
Example
E → E + T    {E.val = E.val + T.val}
E → T        {E.val = T.val}
T → T * F    {T.val = T.val * F.val}
T → F        {T.val = F.val}
F → num      {F.val = num.val}
Input: 2+3*4
Steps to construct the annotated parse tree
Step 1: generate the parse tree
Step 2: traverse the tree left to right and top to bottom
Step 3: whenever there is a reduction, go to the production and carry out the action

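The three steps can be tried out on the input 2+3*4. This is a small sketch, assuming the parse tree has already been built by hand and is encoded as nested tuples; the names annotate and f_num are made up for the illustration.

```python
# Parse-tree nodes are (symbol, children); a leaf is ("num", value).
# This encoding is an assumption of the sketch, not a standard format.
def annotate(node):
    """Post-order walk that returns the synthesized attribute val of a node."""
    symbol, children = node
    if symbol == "num":
        return children                       # num.val is the lexical value
    # Evaluate the children first, skipping the operator tokens.
    vals = [annotate(c) for c in children if c not in ("+", "*")]
    if symbol == "E" and len(vals) == 2:      # E -> E + T
        return vals[0] + vals[1]
    if symbol == "T" and len(vals) == 2:      # T -> T * F
        return vals[0] * vals[1]
    return vals[0]                            # E -> T, T -> F, F -> num

def f_num(n):
    """Build the subtree F -> num for the number n."""
    return ("F", [("num", n)])

# Parse tree of 2 + 3 * 4
tree = ("E", [
    ("E", [("T", [f_num(2)])]),
    "+",
    ("T", [("T", [f_num(3)]), "*", f_num(4)]),
])
result = annotate(tree)   # E.val at the root
```

Because every rule only uses the values of the children, a single post-order walk is enough; the multiplication is applied before the addition simply because of the shape of the tree.
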
Syntax Directed Definitions
Attributed CFG Example
 Attributed grammar that calculates the value of an expression
Syntax Rules    Semantic Rules
E → E + T       E1.v = E2.v + T.v
E → T           E.v = T.v
T → T * F       T1.v = T2.v * F.v
T → F           T.v = F.v
F → id          F.v = id.v
F → (E)         F.v = E.v

Annotated Parse Tree Example

Attributes
 Attributes: The properties of an entity are called its attributes.
 Here the attributes belong to the grammar symbols that appear in the production rules of the grammar.
 There are two kinds of attributes, namely:
   1. Synthesized attributes
   2. Inherited attributes
 The attributes may be the address, value, scope, type and others.

Types of Attributes
 Synthesized Attributes
Synthesized attributes: If the attributes of the parent depend on the attributes of the children, then such attributes are called synthesized attributes.
 The attribute of a node is defined in terms of:
   Attribute values at the children of the node
   Attribute values at the node itself
 An SDD involving only synthesized attributes is called an S-attributed definition.

Types of Attributes
 Inherited Attributes
Inherited attributes: If the attributes of the children depend on the attributes of the parent, then such attributes are called inherited attributes. The children generally inherit the properties of the parent.
 The attribute of a node is defined in terms of:
   Attribute values at the parent of the node
   Attribute values at left siblings
   Attribute values at the node itself
 An SDD involving both synthesized and inherited attributes, with the inherited attributes restricted as above, is called an L-attributed definition.
 NB: Terminals can have synthesized attributes, but not inherited attributes.

Types of Attributes

Synthesized Attributes
Production      Semantic Rules
L → E n         { print (E.val) }
E → E1 + T      { E.val := E1.val + T.val }
E → T           { E.val := T.val }
T → T1 * F      { T.val := T1.val * F.val }
T → F           { T.val := F.val }
F → (E)         { F.val := E.val }
F → digit       { F.val := digit.lexval }
 The first production, L → E n, prints the value of the expression generated by E. L is the start symbol of the grammar.
 The second production, E → E1 + T, computes E.val as the sum of E1.val and T.val.
 Similarly, all the remaining productions are assigned semantic rules.

Synthesized Attributes…
Parse tree for the input string 3 * 5 + 4 n, constructed using the grammar and rules

Inherited Attributes…

Production Rules
D → T L
T → int
T → real
L → L , id
L → id

Derivation of int id , id , id
D ⇒ T L ⇒ int L ⇒ int L , id ⇒ int L , id , id ⇒ int id , id , id

Inherited Attributes…
D → T L         { L.in := T.type }
T → int         { T.type := integer }
T → real        { T.type := real }
L → L1 , id     { L1.in := L.in ; addtype (id.entry, L.in) }
L → id          { addtype (id.entry, L.in) }
Input string is: real id1,id2
type is a synthesized attribute
in is an inherited attribute

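The flow of the inherited attribute for the input real id1,id2 can be imitated with two small functions. This is a hedged sketch: symbol_table, eval_D and eval_L are hypothetical names, and the parse is assumed to have already split the input into the type keyword and the list of identifiers.

```python
symbol_table = {}

def addtype(entry, typ):
    """Record the type of an identifier, as the semantic rules do."""
    symbol_table[entry] = typ

def eval_L(ids, inherited):
    """L -> L1 , id | id : every L node receives the same inherited type."""
    if len(ids) > 1:
        eval_L(ids[:-1], inherited)      # L1.in := L.in
    addtype(ids[-1], inherited)          # addtype(id.entry, L.in)

def eval_D(type_keyword, ids):
    """D -> T L : T.type is synthesized, then handed down as L.in."""
    t_type = {"int": "integer", "real": "real"}[type_keyword]   # T.type
    eval_L(ids, inherited=t_type)                               # L.in := T.type

eval_D("real", ["id1", "id2"])
```

The value real is computed once at T and then only copied downwards, which is what makes in an inherited attribute.
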
Syntax directed translation and attributes
 A syntax-directed definition (SDD) is a context-free grammar together with attributes and rules.
 Attributes are associated with grammar symbols and rules are associated with productions.
 The attribute “val” tracks the actual value of an unsigned integer as the input is scanned and parsed.
 Translation is nothing but associating a value with a grammar symbol.
 Syntax directed translation can be of two types:
   1. Synthesized translation
   2. Inherited translation

S-attributed Grammar
 Definition
   An S-Attributed Definition is a Syntax Directed Definition that uses only synthesized attributes.
 Evaluation Order
   Semantic rules in an S-Attributed Definition can be evaluated by a bottom-up, or PostOrder, traversal of the parse-tree.

S-attributed Grammar
S-attributed Grammar Example

Syntax Rules    Semantic Rules
L → E n         print(E.val)
E → E1 + T      E.val = E1.val + T.val
E → T           E.val = T.val
T → T1 * F      T.val = T1.val * F.val
T → F           T.val = F.val
F → (E)         F.val = E.val
F → digit       F.val = digit.lexval

S-attributed Grammar
S-attributed Grammar Example
 The annotated parse-tree for the
input 3*5+4n from the above S-
attributed grammar is:

S-attributed Grammar

 Exercises
 Give the annotated parse tree of (3+4)*(5+6)n from the following grammar
Syntax Rules    Semantic Rules
L → E n         print(E.val)
E → E1 + T      E.val = E1.val + T.val
E → T           E.val = T.val
T → T1 * F      T.val = T1.val * F.val
T → F           T.val = F.val
F → (E)         F.val = E.val
F → digit       F.val = digit.lexval

L-attributed Grammars
 Definition
   An L-Attributed Definition is a Syntax Directed Definition that uses both synthesized and inherited attributes, where each inherited attribute depends only on the parent and on the siblings to its left.
   It is always possible to rewrite a syntax directed definition to use only synthesized attributes.
 Evaluation Order
   Inherited attributes cannot be evaluated by a simple PreOrder traversal of the parse-tree.

L-attributed Grammars

 An L-Attributed grammar that associates to an identifier its type

PRODUCTION      SEMANTIC RULE
D → T L         L.in := T.type
T → int         T.type := integer
T → real        T.type := real
L → L1, id      L1.in := L.in; addtype(id.entry, L.in)
L → id          addtype(id.entry, L.in)

L-attributed Grammars
 The annotated parse-tree for the input real id1, id2, id3 from the above L-attributed grammar is:

 L.in is then inherited top-down the tree by the other L-nodes.
 At each L-node the procedure addtype inserts into the symbol table the type of the identifier.

L-attributed Grammars
 Identify the non-L-attributed production
 Example:
   A → L M     L.h = f1(A.h)
               M.h = f2(L.s)
               A.s = f3(M.s)
 Example:
   A → Q R     R.h = f4(A.h)
               Q.h = f5(R.s)
               A.s = f6(Q.s)

Implementing SDD
 Dependency Graphs
 S-Attributed Definitions
 L-Attributed Definitions

Dependency Graph
 The graph that shows the flow of information which helps in the computation of the various attribute values in a particular parse tree.
 Implementing an SDD consists primarily in finding an order for the evaluation of attributes.
 The attributes should be evaluated in a given order because they depend on one another.

Inter-dependency of Attributes
 A Dependency Graph shows the interdependencies among the attributes of the various nodes of a parse-tree
 Dependency Graphs are the most general technique used to evaluate SDD with both synthesized and inherited attributes.
T(j) ----D()---> E(i) if and only if there exists a semantic action such as E(i) := f (... T (j) ...)

Dependency Graph
 Algorithm for the construction of the dependency graph
for each node n in the parse tree do
    for each attribute a of the grammar symbol at n do
        Construct a node in the dependency graph for a
for each node n in the parse tree do
    for each semantic rule b := f (c1, c2, . . ., ck) associated with the production used at n do
        for i := 1 to k do
            Construct an edge from the node for ci to the node for b;

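The two passes of the algorithm translate almost line by line into code. The sketch below assumes each parse-tree node is given as a dictionary listing its attribute instances and the semantic rules of its production; that encoding and the attribute names such as "L1.in" are illustrative choices, not part of the algorithm.

```python
def build_dependency_graph(nodes):
    """Build (vertices, edges) of the dependency graph.

    nodes is a list of dicts with the keys
      "attrs": attribute instances at this parse-tree node
      "rules": list of (b, [c1, ..., ck]) meaning b := f(c1, ..., ck)
    """
    vertices = set()
    edges = set()
    for n in nodes:                      # first pass: one vertex per attribute
        for a in n["attrs"]:
            vertices.add(a)
    for n in nodes:                      # second pass: one edge per dependency
        for b, cs in n["rules"]:
            for c in cs:
                edges.add((c, b))        # edge from ci to b
    return vertices, edges

# Parse tree of "real id1, id2"; addtype side effects are left out.
nodes = [
    {"attrs": [], "rules": [("L1.in", ["T.type"])]},          # D -> T L
    {"attrs": ["T.type"], "rules": [("T.type", [])]},         # T -> real
    {"attrs": ["L1.in"], "rules": [("L2.in", ["L1.in"])]},    # L -> L1 , id
    {"attrs": ["L2.in"], "rules": []},                        # L -> id
]
vertices, edges = build_dependency_graph(nodes)
```

The rule T.type := real has no arguments, so it contributes a vertex but no edge.
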
Dependency Graph

 Build the dependency graph for the parse-tree of real id1, id2, id3

Example
The dependency graph for the parse-tree of real id1, id2, id3
Evaluation Order
 The evaluation order of semantic rules depends on a topological sort derived from the dependency graph
 A topological sort
   of a directed acyclic graph is any ordering m1, m2, . . ., mk of the nodes of the graph such that edges go from nodes earlier in the ordering to later nodes.
   i.e., if mi→mj is an edge from mi to mj then mi appears before mj in the ordering
 Any topological sort of a dependency graph gives a valid order to evaluate the semantic rules.

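One common way to obtain such an ordering is Kahn's algorithm, which repeatedly removes a node with no unevaluated predecessors. The sketch below is one possible implementation, not the one prescribed by the course; the attribute names in the example are illustrative.

```python
from collections import deque

def topological_order(vertices, edges):
    """Kahn's algorithm; raises ValueError if the graph has a cycle."""
    indegree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    # Nodes without incoming edges can be evaluated immediately.
    ready = deque(sorted(v for v in vertices if indegree[v] == 0))
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for w in successors[v]:
            indegree[w] -= 1
            if indegree[w] == 0:         # all inputs of w are now available
                ready.append(w)
    if len(order) != len(vertices):
        raise ValueError("dependency graph has a cycle")
    return order

# Dependencies of the inherited attribute for "real id1, id2, id3"
order = topological_order(
    ["T.type", "L1.in", "L2.in", "L3.in"],
    [("T.type", "L1.in"), ("L1.in", "L2.in"), ("L2.in", "L3.in")],
)
```

If some nodes are never released, the graph contains a cycle and no evaluation order exists.
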
Evaluating Semantic Rules
 Parse Tree methods
   At compile time, the evaluation order is obtained from the topological sort of the dependency graph.
   Fails if the dependency graph has a cycle
 Rule Based Methods
   Semantic rules are analyzed by hand or by specialized tools at compiler construction time
   The order of evaluation of attributes associated with a production is predetermined at compiler construction time
   For this method, the dependency graph need not be constructed
 Oblivious Methods
   Evaluation order is chosen without considering the semantic rules.

Strongly Non-Circular Syntax Directed Definitions

 Formalisms for which an attribute evaluation order can be fixed at compiler construction time
 Two kinds of strongly non-circular definitions:
   S-Attributed and
   L-Attributed Definitions

Evaluation of S-Attributed Definitions

 Synthesized attributes can be evaluated by a bottom-up parser as the input is being analyzed, avoiding the construction of a dependency graph.
 The parser keeps the values of the synthesized attributes in its stack.
 Whenever a reduction A → α is made, the attribute for A is computed from the attributes of α which appear on the stack.

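The stack discipline can be illustrated without writing a full LR parser. The sketch below assumes the sequence of shift and reduce moves for 3*5+4 has already been produced by a bottom-up parser and only replays it on a value stack; run_reductions and the string encoding of productions are assumptions of the example.

```python
def run_reductions(moves):
    """Replay parser moves while keeping attribute values on a stack.

    moves is a list of ("shift", token) and ("reduce", production) pairs,
    assumed to come from a bottom-up parser for the expression grammar.
    """
    stack = []
    for action, arg in moves:
        if action == "shift":
            stack.append(arg)            # digit.lexval or an operator symbol
        elif arg in ("F -> digit", "T -> F", "E -> T"):
            pass                         # val is copied, the stack top is unchanged
        elif arg == "T -> T * F":
            f = stack.pop()
            stack.pop()                  # discard "*"
            t = stack.pop()
            stack.append(t * f)          # T.val := T1.val * F.val
        elif arg == "E -> E + T":
            t = stack.pop()
            stack.pop()                  # discard "+"
            e = stack.pop()
            stack.append(e + t)          # E.val := E1.val + T.val
    return stack

moves = [
    ("shift", 3), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("shift", "*"), ("shift", 5), ("reduce", "F -> digit"),
    ("reduce", "T -> T * F"), ("reduce", "E -> T"),
    ("shift", "+"), ("shift", 4), ("reduce", "F -> digit"),
    ("reduce", "T -> F"), ("reduce", "E -> E + T"),
]
final_stack = run_reductions(moves)
```

At each reduction the values of the right-hand side are exactly the topmost stack entries, so no parse tree or dependency graph is ever built.
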
Evaluation of L-Attributed Definitions

 The following procedure evaluates L-Attributed Definitions by mixing PostOrder (synthesized) and PreOrder (inherited) traversal.
 Algorithm: L-Eval(n: Node)
 Input: Node of an annotated parse-tree.
 Output: Attribute evaluation.
Begin
    For each child m of n, from left-to-right Do
    Begin
        Evaluate inherited attributes of m;
        L-Eval(m)
    End;
    Evaluate synthesized attributes of n
End.

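L-Eval can be written directly as a recursive function. In this sketch the actual attribute computations are passed in as callbacks, and a trace list records the order of evaluation; Node, inherit, synthesize and trace are names chosen for the illustration only.

```python
class Node:
    """Minimal parse-tree node: a label and an ordered list of children."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def l_eval(n, inherit, synthesize, trace):
    """Depth-first evaluation in the order used by L-Eval.

    inherit(m) and synthesize(n) are caller-supplied callbacks that compute
    the attributes of a node; trace records the order in which they run.
    """
    for m in n.children:                 # children from left to right
        inherit(m)                       # inherited attributes before descending
        trace.append(("inh", m.label))
        l_eval(m, inherit, synthesize, trace)
    synthesize(n)                        # synthesized attributes after all children
    trace.append(("syn", n.label))

# Tree for D -> T L, with no-op callbacks to expose only the order
root = Node("D", [Node("T"), Node("L")])
trace = []
l_eval(root, lambda m: None, lambda n: None, trace)
```

In the trace, T is completely finished before the inherited attribute of L is evaluated, which is why L.in may depend on T.type.
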
CHAPTER FIVE
TYPE CHECKING

Compiled By: Nathnael T.


Introduction

 A compiler must check whether the source program follows the semantic conventions of the source language. This is called static checking and is performed at compile time.
 Dynamic Check
   executed during execution of the program (at run time)

Examples of static checks
 Type Checks
   Checks if an operator has the right type of operands
 Flow-of-Control Checks
   Statements that cause flow of control to leave a construct must have some place to which to transfer the flow of control
   For example, a break statement in C that is not inside an enclosing loop or switch statement is an error
 Uniqueness Checks
   There are situations in which an object must be defined exactly once
   For example, labels (switch statement cases) must be unique in C++; in Pascal, an identifier must be declared uniquely


Examples of static checks
Type check
Uniqueness
Flow of Control

Examples of Dynamic Checks

 Array Out of Bound
    int [] a = new int [10];
    for (int i=0; i<20; ++i)
        a[i] = i;
 Division by zero
    float a=50;
    for (int i=0; i<5; ++i)
        a = a/i;

Type Checking Introduction

 A compiler must check that a program follows the type rules of a language.
 The type checker is a module of a compiler devoted to type checking tasks
 Examples of tasks
   The operator mod is defined only if the operands are integers;
   Indexing is allowed only on an array and the index must be an integer;
   A function must have a precise number of arguments and the parameters must have a correct type.

Type Checker
 The type checker verifies that the type of a construct (constant, variable, array, list, object) matches what is expected in its usage context
 E.g.
   int x=“abc”;
 Some operators (+,-,*,/) are “overloaded”; i.e., they can apply to objects of different types
 Functions may be polymorphic; i.e., accept arguments of different types.
 Type information produced by the type checker may be needed when the code is generated.

Types of Types

 Basic types
   are atomic types that have no internal structure as far as the programmer is concerned
   They include types like integer, real, boolean, character, and enumerated types
 Constructed types
   include arrays, records, sets, and structures constructed from the basic types and/or other constructed types

Type Checker

 The type checker can handle arrays, pointers, statements and functions.
 The design of a type checker depends on
   the syntactic structure of language constructs (e.g. operator)
   the type expressions of the language (e.g. int, float, array)
   the rules for assigning types to constructs



Type Expressions

 The type of a language construct will be denoted by a type expression
 A type expression is either a basic type or formed by applying an operator called a type constructor to other type expressions
1. A basic type is a type expression (e.g. int)
2. A type name is a type expression (e.g. ptr: *int, then x: ptr)

Type Expressions cont’d
 A type constructor applied to a type expression is a type expression. Constructors include:
 Arrays
   If I is an index set and T is a type expression, then array (I, T) is a type expression
   Example: array[1..10] of int == array(1..10, int);
 Records
   The difference between products and records is that records have names for record fields.

Type System
 Type System:
   Collection of rules for assigning type expressions to the various parts of a program
 Type systems are specified using syntax directed definitions
 A type checker implements a type system
 Sound type system: one where any program that passes the static type checker cannot contain run-time type errors. Such languages are said to be strongly typed languages.

Specification of a Type System
 The syntax directed definition for associating a type to an identifier is:

 All the attributes are synthesized.
 Since P → D;E, all the identifiers will have their types saved in the symbol table.

Specification of a Type System

 The syntax directed definition for associating a type to an expression is:

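The SDD itself is not reproduced in this text, so the following is only a sketch of how a synthesized type attribute could be computed for two of the constructs mentioned earlier, mod and array indexing. The tuple encoding of expressions and types, and the name type_of, are assumptions made for the example.

```python
def type_of(expr, symtab):
    """Return the type expression of expr, or "type_error".

    expr is a nested tuple (hypothetical encoding):
      ("num", 3)   ("id", "x")   ("mod", e1, e2)   ("index", e1, e2)
    Types are "integer", "real", or ("array", size, element_type).
    """
    kind = expr[0]
    if kind == "num":
        return "integer" if isinstance(expr[1], int) else "real"
    if kind == "id":
        return symtab.get(expr[1], "type_error")   # undeclared identifier
    # Binary constructs: the types of both operands are synthesized first.
    left = type_of(expr[1], symtab)
    right = type_of(expr[2], symtab)
    if kind == "mod":
        return "integer" if left == right == "integer" else "type_error"
    if kind == "index":
        is_array = isinstance(left, tuple) and left[0] == "array"
        return left[2] if is_array and right == "integer" else "type_error"
    return "type_error"

symtab = {"a": ("array", 10, "real"), "i": "integer"}
ok = type_of(("index", ("id", "a"), ("id", "i")), symtab)      # a[i]
bad = type_of(("index", ("id", "a"), ("num", 1.5)), symtab)    # a[1.5]
```

The type of an expression depends only on the types of its subexpressions and on the symbol table, so all attributes here are synthesized.
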
Specification of a Type System

 The syntax directed definition for associating a type to a statement is:

 The type expression for a statement is either void or type error.

Equivalence of Type Expressions

 In the above SDD we compared the type expressions using the equal operator. However, such an operator is not defined except perhaps between basic types.
 A natural notion of equivalence is
   Structural equivalence

Structural Equivalence

 Two type expressions are structurally equivalent if and only if they are
   the same basic types or
     E.g. integer is equivalent only to integer
   formed by applying the same constructor to structurally equivalent types
     E.g. pointer (integer) is structurally equivalent to pointer (integer)

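The two cases of the definition map onto a short recursive test. This sketch assumes basic types are strings and constructed types are tuples whose first element names the constructor; the function name sequiv is chosen for the example.

```python
def sequiv(s, t):
    """Structural equivalence of two type expressions.

    Basic types are strings such as "integer"; constructed types are tuples
    like ("pointer", "integer") or ("array", 10, "integer").
    """
    if not isinstance(s, tuple) or not isinstance(t, tuple):
        return s == t                                  # same basic type (or same size)
    return (len(s) == len(t)
            and s[0] == t[0]                           # same constructor
            and all(sequiv(a, b) for a, b in zip(s[1:], t[1:])))

same = sequiv(("pointer", "integer"), ("pointer", "integer"))
different = sequiv(("pointer", "integer"), ("pointer", "real"))
```

The recursion stops at the basic types, so the test only terminates for type expressions that are not cyclic.
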
Name Equivalence
type
    ptr = *integer
var
    A : ptr;
    B : ptr;
    C : *integer;
    D, E : *integer;
 In some languages, types can be given names
 Do the variables A, B, C, D, E have the same type?
   The answer varies from implementation to implementation
 Name Equivalence:
   We have name equivalence between two type expressions if and only if they are identical
 Structural equivalence
   Names are replaced by the type expressions they define, so two types are structurally equivalent if they represent two structurally equivalent type expressions when all names have been substituted

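The difference between the two notions can be checked on the declarations of A to E. The sketch below takes the definition literally: name equivalence compares the type expressions as written, while structural equivalence first substitutes the names. The dictionaries and function names are assumptions of the example, and as noted above a real implementation may decide the C, D, E cases differently.

```python
type_names = {"ptr": ("pointer", "integer")}           # type ptr = *integer
var_types = {
    "A": "ptr",
    "B": "ptr",
    "C": ("pointer", "integer"),
    "D": ("pointer", "integer"),
    "E": ("pointer", "integer"),
}

def expand(t):
    """Replace every type name by the type expression it defines."""
    if isinstance(t, str):
        return expand(type_names[t]) if t in type_names else t
    return (t[0],) + tuple(expand(arg) for arg in t[1:])

def name_equiv(x, y):
    """Identical type expressions, with names left unexpanded."""
    return var_types[x] == var_types[y]

def struct_equiv(x, y):
    """Equal type expressions after all names have been substituted."""
    return expand(var_types[x]) == expand(var_types[y])

a_vs_b = name_equiv("A", "B")        # both declared with the name ptr
a_vs_c = name_equiv("A", "C")        # ptr versus *integer
a_vs_c_struct = struct_equiv("A", "C")
```

Under this reading A and B share a type, C, D and E share a type, and the two groups only coincide under structural equivalence. The expand function would not terminate on recursive type names, which this sketch does not handle.
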
Type Conversion
 Coercion is implicit type conversion done by the compiler
 Explicit type conversion is done by the programmer
 Example. What’s the type of “x + y” if:
1. x is of type real;
2. y is of type int;
3. Different machine instructions are used for operations on reals and integers.
 Depending on the language, specific conversion rules must be adopted by the compiler to convert the type of one of the operands of +
 The type checker in a compiler can insert these conversion operations.

Type Coercion in Expressions

 The SDD for coercion from integer to real for a generic arithmetic operation op is

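Since the SDD is not reproduced in this text, the following sketch only illustrates the idea for the x + y example: when exactly one operand is an integer, a conversion is inserted before a real operation is emitted. The instruction names inttoreal, int+ and real+, the temporaries t and u, and the function coerce_binary are illustrative assumptions, and only the types integer and real are handled.

```python
def coerce_binary(op, left, right):
    """Return (result_type, code) for `left op right`.

    left and right are (type, place) pairs with type "integer" or "real".
    """
    (lt, lp), (rt, rp) = left, right
    if lt == "integer" and rt == "integer":
        return "integer", [f"t := {lp} int{op} {rp}"]   # no conversion needed
    code = []
    if lt == "integer":                  # widen the left operand
        code.append(f"u := inttoreal {lp}")
        lp = "u"
    if rt == "integer":                  # widen the right operand
        code.append(f"u := inttoreal {rp}")
        rp = "u"
    code.append(f"t := {lp} real{op} {rp}")
    return "real", code

# x is real, y is int
result_type, code = coerce_binary("+", ("real", "x"), ("integer", "y"))
```

For this input the result type is real and the integer operand y is converted before the addition.
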
CHAPTER SIX
INTERMEDIATE CODE GENERATION

Compiled By: Nathnael T.


Intermediate Code Generation

 In a compiler, the front end translates a source program into an intermediate representation, and the back end generates the target code from this intermediate representation
 The benefits of using a machine-independent intermediate code (IC) are:
   retargeting to another machine is facilitated
   optimization can be done on the machine-independent code

Intermediate Languages

 Syntax trees,
 Postfix notation, and
 Three-address code

Syntax Tree

 A syntax tree (abstract tree) is a condensed form of parse tree useful for representing language constructs
[Figure: (a) parse tree for a + b; (b) abstract tree for a + b]

Postfix Notation

 The postfix notation is practical for an intermediate representation as the operands are found just before the operator
 In fact, the postfix notation is a linearized representation of a syntax tree
 e.g., 1 * 2 + 3 will be represented in the postfix notation as 1 2 * 3 +

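The linearization is a post-order walk of the syntax tree. A minimal sketch, assuming the tree is encoded as nested (operator, left, right) tuples with plain values as leaves; to_postfix is a name chosen for the example.

```python
def to_postfix(tree):
    """Post-order walk of a syntax tree given as (op, left, right) or a leaf."""
    if not isinstance(tree, tuple):
        return [str(tree)]               # a leaf is emitted as is
    op, left, right = tree
    # Both operands are emitted before their operator.
    return to_postfix(left) + to_postfix(right) + [op]

# Syntax tree of 1 * 2 + 3
postfix = " ".join(to_postfix(("+", ("*", 1, 2), 3)))
```
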
Three-Address code
 The three address code is a sequence of statements of the form:
X := Y op Z
 where: X, Y, and Z are names, constants or compiler-generated temporaries
 op is an operator such as an integer or floating point arithmetic operator or a logical operator on Boolean data
 Important Notes:
   No built-up arithmetic expression is permitted
   Only one operator at the right side of the assignment is possible, i.e., x + y + z is not possible
   It has been given the name three-address code because each statement usually contains three addresses, two for the operands and one for the result

Types of Three-Address Statements
Statement                        Format                Comments
Assignment (binary operation)    X := Y op Z           Arithmetic and logical operators used
Assignment (unary operation)     X := op Y             Unary -, not, conversion operators used
Copy statement                   X := Y
Unconditional jump               goto L
Conditional jump                 if X relop Y goto L
Function call                    param X1              The parameters are specified by param
                                 param X2              The procedure p is called by indicating the
                                 …                     number of parameters n
                                 param Xn
                                 call p, n
Indexed assignments              X := Y[I]             X will be assigned the value at the address Y + I
                                 Y[I] := X             The value at the address Y + I will be assigned X
Address and pointer assignments  X := &Y               X is assigned the address of Y
                                 X := *Y               X is assigned the element at the address Y
                                 *X := Y               The value at the address X is assigned Y

Syntax-Directed Translation into Three-Address Code

 Syntax directed translation can be used to generate the three-address code
 Generally,
   Either the three-address code is generated as an attribute of the attributed parse tree or
   The semantic actions have side effects that write the three-address code statements in a file

generation

 To this end the following functions are given:
   newtemp - each time this function is called, it gives distinct names that can be used for temporary variables
   newlabel - each time this function is called, it gives distinct names that can be used for label names
 In addition, for convenience, we use the notation gen to create a three-address code from a number of strings
   gen will produce a three-address code after concatenating all the parameters
   For example, if id1.lexeme = x, id2.lexeme = y and id3.lexeme = z:

SDT of a Three Address Code
Production      Semantic Rules
S → id := E     S.code := E.code || gen (id.lexeme, ‘:=’, E.place)
E → E1 + E2     E.place := newtemp;
                E.code := E1.code || E2.code || gen (E.place, ‘:=’, E1.place, ‘+’, E2.place)
E → E1 * E2     E.place := newtemp;
                E.code := E1.code || E2.code || gen (E.place, ‘:=’, E1.place, ‘*’, E2.place)
E → - E1        E.place := newtemp;
                E.code := E1.code || gen (E.place, ‘:= uminus ’, E1.place)
E → (E1)        E.place := E1.place;
                E.code := E1.code
E → id          E.place := id.lexeme;
                E.code := ‘’ /* empty code */
 the attribute place holds the name that will hold the value of the grammar symbol
 the attribute code holds the sequence of three-address statements evaluating the grammar symbol
 the function newtemp returns a sequence of distinct names t1, t2, . . . in response to successive calls

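The rules in the table can be followed with a small recursive translator that returns the pair (place, code) for each expression. This is a sketch under assumptions: expressions are given as nested tuples rather than parsed from text, code is kept as a list of strings instead of being concatenated with ||, and parentheses need no case of their own because the tree shape already encodes them.

```python
import itertools

_counter = itertools.count(1)

def newtemp():
    """Return a fresh temporary name t1, t2, ... on each call."""
    return f"t{next(_counter)}"

def translate(expr):
    """Return (place, code) for an expression tree.

    Leaves are identifier names; inner nodes are ("+", e1, e2),
    ("*", e1, e2) or ("uminus", e1).
    """
    if isinstance(expr, str):                       # E -> id
        return expr, []
    if expr[0] == "uminus":                         # E -> - E1
        p1, c1 = translate(expr[1])
        place = newtemp()
        return place, c1 + [f"{place} := uminus {p1}"]
    op, e1, e2 = expr                               # E -> E1 + E2 | E1 * E2
    p1, c1 = translate(e1)
    p2, c2 = translate(e2)
    place = newtemp()
    return place, c1 + c2 + [f"{place} := {p1} {op} {p2}"]

def assign(name, expr):                             # S -> id := E
    place, code = translate(expr)
    return code + [f"{name} := {place}"]

# a := b + (-c) * d
code = assign("a", ("+", "b", ("*", ("uminus", "c"), "d")))
```

In this sketch newtemp is called after the operands have been translated, so temporaries are numbered in the order in which their statements are emitted; the table itself does not fix that order.
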
I Thank You!
