unit6
✔ Code Generation:
▪ Issues in the design of a code generator
▪ The target language
▪ Basic blocks and flow graphs
▪ Next-use information
▪ A simple code generator
▪ The DAG representation of basic blocks
✔ Code Optimization:
▪ Peephole optimization
▪ The principal sources of optimization
▪ Loops in flow graphs
Issues in Code Generation
Issues in Code Generation are:
1. Input to code generator
2. Target program
3. Memory management
4. Instruction selection
5. Register allocation
6. Choice of evaluation order
7. Approaches to code generation
Input to code generator
Input to the code generator consists of the intermediate representation of the source program.
Types of intermediate language are:
1. Postfix notation
2. Quadruples
3. Syntax trees or DAGs
Semantic errors should be detected before the input is submitted to the code
generator.
The code generation phase requires completely error-free intermediate code as its input.
Target program
The output may be in the form of:
1. Absolute machine language: An absolute machine language program can be placed in a
fixed memory location and executed immediately.
2. Relocatable machine language: Subroutines can be compiled separately. A set of
relocatable object modules can be linked together and loaded for execution.
3. Assembly language: Producing an assembly language program as output makes
code generation easier, but an assembler is then required to convert the code into binary
form.
Memory management
Mapping names in the source program to addresses of data objects in run time memory is done
cooperatively by the front end and the code generator.
We assume that a name in a three-address statement refers to a symbol table entry for the
name.
From the symbol table information, a relative address can be determined for the name in a data
area.
Instruction selection
Example: the sequence of statements
a := b + c
d := a + e
would be translated into
MOV b, R0
ADD c, R0
MOV R0, a
MOV a, R0
ADD e, R0
MOV R0, d
Here the fourth instruction (MOV a, R0) is redundant, because R0 already holds the value of a, so it can be eliminated.
Register allocation
The use of registers is often subdivided into two subproblems:
1. During register allocation, we select the set of variables that will reside in registers at a point in
the program.
2. During a subsequent register assignment phase, we pick the specific register that a variable will
reside in.
Finding an optimal assignment of registers to variables is difficult, even with single-register
values.
Mathematically, the problem is NP-complete.
Choice of evaluation order
The order in which computations are performed can affect the efficiency of the target code.
Some computation orders require fewer registers to hold intermediate results than others.
Picking a best order is another difficult, NP-complete problem.
Approaches to code generation
The most important criterion for a code generator is that it produces correct code.
The code generator should be designed so that it can be implemented, tested, and
maintained easily.
Target machine
Instruction Cost
The addressing modes, together with their assembly language forms and associated costs, are as
follows.
The cost of an instruction is one plus the costs associated with its source and
destination addressing modes, given by "extra cost".
Mode               Form    Address                     Extra cost
Absolute           M       M                           1
Register           R       R                           0
Indexed            k(R)    k + contents(R)             1
Indirect register  *R      contents(R)                 0
Indirect indexed   *k(R)   contents(k + contents(R))   1
[Worked examples not reproduced: the slides show one instruction sequence with Total Cost = 6 and another with Total Cost = 2.]
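The cost rule can be checked mechanically. The Python sketch below is only an illustration: the operand syntax assumed by operand_mode (registers written R0, R1, and so on; anything without parentheses or a leading * treated as an absolute address) is an assumption and not part of the slides. It costs the first three instructions of the a := b + c translation shown under Instruction selection.

```python
# Extra cost contributed by each addressing mode (taken from the table above).
EXTRA_COST = {
    "absolute": 1,
    "register": 0,
    "indexed": 1,
    "indirect_register": 0,
    "indirect_indexed": 1,
}

def operand_mode(operand):
    """Classify an operand string. The syntax rules here are assumptions."""
    if operand.startswith("*"):
        return "indirect_indexed" if "(" in operand else "indirect_register"
    if "(" in operand:
        return "indexed"
    if operand.startswith("R") and operand[1:].isdigit():
        return "register"
    return "absolute"

def instruction_cost(instruction):
    """Cost = 1 + extra cost of the source mode + extra cost of the destination mode."""
    _, operands = instruction.split(None, 1)
    source, destination = [part.strip() for part in operands.split(",")]
    return 1 + EXTRA_COST[operand_mode(source)] + EXTRA_COST[operand_mode(destination)]

program = ["MOV b, R0", "ADD c, R0", "MOV R0, a"]
total = sum(instruction_cost(line) for line in program)
print(total)  # 2 + 2 + 2 = 6
```

Each of the three instructions touches one absolute address and one register, so each costs 1 + 1 + 0 = 2.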
Basic Blocks
A basic block is a sequence of consecutive statements in which flow of control enters at the
beginning and leaves at the end without halting or branching, except possibly at the end.
The following sequence of three-address statements forms a basic block:
t1 := a*a
t2 := a*b
t3 := 2*t2
t4 := t1+t3
t5 := b*b
t6 := t4+t5
Algorithm: Partition into basic blocks
Input: A sequence of three-address statements.
Output: A list of basic blocks with each three-address statement in exactly one block.
Method:
1. We first determine the set of leaders, using the following rules:
I. The first statement is a leader.
II. Any statement that is the target of a conditional or unconditional goto is a leader.
III. Any statement that immediately follows a goto or conditional goto statement is a leader.
2. For each leader, its basic block consists of the leader and all statements up to but not
including the next leader or the end of the program.
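The two steps of the method can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: it assumes statements are plain strings numbered from 1 and that every jump is written as "goto (n)", a textual convention chosen here only for the example. The input is the twelve-statement three-address code of the dot-product example in this section.

```python
import re

def partition_into_basic_blocks(statements):
    """Return the basic blocks as lists of 1-based statement numbers."""
    n = len(statements)
    leaders = {1} if n else set()               # Rule I: the first statement
    for number, text in enumerate(statements, start=1):
        match = re.search(r"goto \((\d+)\)", text)
        if match:
            leaders.add(int(match.group(1)))    # Rule II: target of a goto
            if number < n:
                leaders.add(number + 1)         # Rule III: statement after a goto
    ordered = sorted(leaders)
    blocks = []
    for position, leader in enumerate(ordered):
        # A block runs from its leader up to, but not including, the next leader.
        end = ordered[position + 1] if position + 1 < len(ordered) else n + 1
        blocks.append(list(range(leader, end)))
    return blocks

code = [
    "prod := 0", "i := 1", "t1 := 4*i", "t2 := a[t1]", "t3 := 4*i",
    "t4 := b[t3]", "t5 := t2*t4", "t6 := prod + t5", "prod := t6",
    "t7 := i + 1", "i := t7", "if i <= 20 goto (3)",
]
blocks = partition_into_basic_blocks(code)
print(blocks)  # [[1, 2], [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]]
```

Statements (1) and (3) are the only leaders, so the code splits into two blocks.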
Example: Partition into basic blocks
Source program:
begin
    prod := 0;
    i := 1;
    do
        prod := prod + a[i] * b[i];
        i := i+1;
    while i <= 20
end
Three-address code:
(1)  prod := 0                Leader (first statement)
(2)  i := 1
(3)  t1 := 4*i                Leader (target of the goto in (12))
(4)  t2 := a[t1]
(5)  t3 := 4*i
(6)  t4 := b[t3]
(7)  t5 := t2*t4
(8)  t6 := prod + t5
(9)  prod := t6
(10) t7 := i+1
(11) i := t7
(12) if i <= 20 goto (3)
Block B1 consists of statements (1)-(2) and block B2 of statements (3)-(12).
Flow Graph
Block B1:
    prod := 0
    i := 1
Block B2:
    t1 := 4*i
    t2 := a[t1]
    t3 := 4*i
    t4 := b[t3]
    t5 := t2*t4
    t6 := prod + t5
    prod := t6
    t7 := i+1
    i := t7
    if i <= 20 goto B2
Edges: B1 -> B2 (fall through) and B2 -> B2 (the conditional jump back to the start of B2).
Transformation on Basic Blocks
A number of transformations can be applied to a basic block without changing the set of
expressions computed by the block.
Many of these transformations are useful for improving the quality of the code.
Types of transformations are:
1. Structure preserving transformation
2. Algebraic transformation
Structure Preserving Transformations
Structure-preserving transformations on basic blocks are:
1. Common sub-expression elimination
2. Dead-code elimination
3. Renaming of temporary variables
4. Interchange of two independent adjacent statements
Common sub-expression elimination
Consider the basic block,
a:= b+c
b:= a-d
c:= b+c
d:= a-d
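The second and fourth statements compute the same expression, a-d, and neither a nor d changes in between.
The first and third statements do not compute the same value, even though both read b+c, because the second statement redefines b.
Hence this basic block may be transformed into the equivalent block: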
a:= b+c
b:= a-d
c:= b+c
d:= b
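A minimal Python sketch of this transformation within one basic block. The (dest, left, op, right) tuple encoding of statements is an assumption made for the example, and the sketch does not try to exploit commutativity (b+c and c+b are treated as different expressions).

```python
def eliminate_common_subexpressions(block):
    """block: list of (dest, left, op, right) statements.
    A repeated expression is replaced by a copy (dest, holder, None, None)."""
    available = {}   # (left, op, right) -> name that currently holds that value
    result = []
    for dest, left, op, right in block:
        key = (left, op, right)
        holder = available.get(key)
        if holder is not None:
            result.append((dest, holder, None, None))   # reuse the earlier value
        else:
            result.append((dest, left, op, right))
        # Assigning dest invalidates every expression that reads dest
        # and every expression whose value was held in dest.
        available = {k: v for k, v in available.items()
                     if dest not in (k[0], k[2]) and v != dest}
        # Record the new expression unless dest is one of its own operands.
        if holder is None and dest not in (left, right):
            available[key] = dest
    return result

block = [
    ("a", "b", "+", "c"),
    ("b", "a", "-", "d"),
    ("c", "b", "+", "c"),
    ("d", "a", "-", "d"),
]
optimized = eliminate_common_subexpressions(block)
print(optimized[3])  # ('d', 'b', None, None), i.e. d := b
```

The third statement is left alone because the assignment to b in the second statement removed b+c from the table of available expressions.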
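Dead-code elimination
Suppose x is dead, that is, never subsequently used, at the point where the statement x := y+z appears in a basic block.
Then this statement can be removed without changing the value of the basic block.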
Renaming of temporary variables
Suppose we have a statement
t := b+c, where t is a temporary variable.
If we change this statement to
u := b+c, where u is a new temporary variable,
and change all uses of this instance of t to u, then the value of the basic block is not changed.
In fact, we can always transform a basic block into an equivalent block in which each statement
that defines a temporary defines a new temporary.
We call such a basic block a normal-form block.
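Interchange of two independent adjacent statements
Two adjacent statements t1 := b+c and t2 := x+y can be interchanged without affecting the value of the
block if and only if neither x nor y is t1 and neither b nor c is t2.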
Algebraic Transformation
Countless algebraic transformations can be used to change the set of expressions computed by
the basic block into an algebraically equivalent set.
The useful ones are those that simplify expressions or replace expensive operations with cheaper
ones.
Example: statements such as x := x+0 or x := x*1 can be eliminated.
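The two identities named in the example can be applied with a few lines of Python. This is a sketch under assumptions: statements are encoded as hypothetical (dest, left, op, right) tuples, constants are written as strings, and only the identities x+0 and x*1 are handled.

```python
def simplify(statement):
    """Return a simpler statement, or None when the statement has no effect
    (for example x := x + 0 or x := x * 1)."""
    dest, left, op, right = statement
    identity = {"+": "0", "*": "1"}.get(op)     # identity element of the operator
    if identity is None:
        return statement
    if right == identity:
        other = left
    elif left == identity:
        other = right
    else:
        return statement
    # x := x + 0 does nothing; y := x + 0 becomes the copy y := x.
    return None if other == dest else (dest, other, None, None)

print(simplify(("x", "x", "+", "0")))  # None
print(simplify(("y", "x", "*", "1")))  # ('y', 'x', None, None)
```

A statement that reduces to None can simply be dropped from the block.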
Computing Next Uses
Next-use information tells us, for each name appearing in a statement of a basic block, whether and
where the value of that name is used again later in the block.
The use of a name is defined as follows.
Consider the statements
x := i
j := x op y
Here the second statement (the one that defines j) uses the value of x assigned by the first.
Next-use information can be collected by making a backward scan over the statements
of the block.
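The backward scan can be sketched in Python as below. The encoding is an assumption: each statement dest := operand1 op operand2 is a (dest, operand1, operand2) triple, integer literals are not names, and a next use of None means "no later use inside this block" (liveness on exit from the block is not modelled). The input is the six-statement basic block shown earlier.

```python
def is_name(operand):
    # Assumption: anything that is not an integer literal is a name.
    return not operand.isdigit()

def compute_next_use(block):
    """For every statement, map each name in it to the index of the next
    statement in the block that uses the name (None if there is none)."""
    next_use = {}                        # table maintained during the backward scan
    info = [None] * len(block)
    for index in range(len(block) - 1, -1, -1):
        dest, op1, op2 = block[index]
        names = [n for n in (dest, op1, op2) if is_name(n)]
        # 1. Attach to this statement what is currently known about its names.
        info[index] = {n: next_use.get(n) for n in names}
        # 2. dest is overwritten here, so earlier statements see no next use of it.
        next_use[dest] = None
        # 3. The operands are used here (done after step 2 so x := x op y works).
        for operand in (op1, op2):
            if is_name(operand):
                next_use[operand] = index
    return info

block = [
    ("t1", "a", "a"), ("t2", "a", "b"), ("t3", "2", "t2"),
    ("t4", "t1", "t3"), ("t5", "b", "b"), ("t6", "t4", "t5"),
]
info = compute_next_use(block)
print(info[0])  # {'t1': 3, 'a': 1}
```

Indices are 0-based, so the output says that t1 is next used by the fourth statement and a by the second.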
Storage for Temporary Names
A distinct name is normally created each time a temporary is needed, and space is allocated
for each temporary.
To optimize storage during code generation, we can pack two temporaries into the same
location if they are not live simultaneously.
Consider the three-address code below; the right-hand column shows the same code after the six temporaries are packed into two locations.
Before packing      After packing
t1=a*a              t1=a*a
t2=a*b              t2=a*b
t3=4*t2             t2=4*t2
t4=t1+t3            t1=t1+t2
t5=b*b              t2=b*b
t6=t4+t5            t1=t1+t2
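One way to arrive at such a packing is a greedy pass that reuses a location as soon as its temporary has had its last use. The Python sketch below is an illustration under assumptions: statements are (dest, operand1, operand2) triples, temporaries are recognised by a leading "t", each temporary is defined once before it is used, and the lowest-numbered free location is always chosen.

```python
def pack_temporaries(block):
    """Return a dict temporary -> location number, sharing locations
    between temporaries that are never live at the same time."""
    def is_temp(name):
        return name.startswith("t")      # naming convention assumed for the example

    last_use = {}
    for index, (_, op1, op2) in enumerate(block):
        for name in (op1, op2):
            if is_temp(name):
                last_use[name] = index

    location, free, next_location = {}, [], 1
    for index, (dest, op1, op2) in enumerate(block):
        # Operands whose last use is this statement release their locations
        # first, so the destination may reuse one of them.
        for name in {op1, op2}:
            if is_temp(name) and last_use[name] == index:
                free.append(location[name])
        if free:
            chosen = min(free)
            free.remove(chosen)
        else:
            chosen, next_location = next_location, next_location + 1
        location[dest] = chosen
    return location

block = [
    ("t1", "a", "a"), ("t2", "a", "b"), ("t3", "4", "t2"),
    ("t4", "t1", "t3"), ("t5", "b", "b"), ("t6", "t4", "t5"),
]
packing = pack_temporaries(block)
print(packing)  # {'t1': 1, 't2': 2, 't3': 2, 't4': 1, 't5': 2, 't6': 1}
```

Locations 1 and 2 play the roles of t1 and t2 in the packed code.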
Register and Address Descriptors
The code generator algorithm uses descriptors to keep track of register contents and addresses
for names.
Address descriptor stores the location where the current value of the name can be found at run
time. The information about locations can be stored in the symbol table and is used to access
the variables.
Register descriptor is used to keep track of what is currently in each register. The register
descriptor shows that initially all the registers are empty. As the generation for the block
progresses the registers will hold the values of computation.
Register Allocation & Assignment
Efficient utilization of registers is important in generating good code.
There are four strategies for deciding which values in a program should reside in registers and
in which register each value should reside.
Strategies are:
1. Global Register Allocation
2. Usage Count
3. Register assignment for outer loop
4. Register allocation by graph coloring
Global Register Allocation
Global register allocation strategies are:
One strategy is to keep the most frequently used variables in
fixed registers throughout a loop.
Another strategy is to assign some fixed number of global registers to hold the most active
values in each inner loop.
Registers that are not already allocated may be used to hold values local to one block.
In certain languages, such as C or Bliss, the programmer can do the register allocation by using a register
declaration to keep certain values in registers for the duration of a procedure.
Example:
{
register int x;
}
Usage count
Register assignment for outer loop
[Figure not reproduced: outer loop L1 containing inner loop L2; L1-L2 denotes the part of L1 outside L2.]
Register allocation by graph coloring
Graph coloring works in two passes, as follows:
In the first pass, target machine instructions are selected as though there were an unlimited number of
symbolic registers, so that each variable is given a symbolic register.
In the second pass, the register interference graph is constructed.
In the register interference graph, each node is a symbolic register, and an edge connects two nodes
if one is live at a point where the other is defined.
A graph coloring technique is then applied to the register interference graph using k colors,
where k is the number of assignable registers.
In graph coloring, no two adjacent nodes can have the same color. Hence each node of the register
interference graph (each symbolic register) is assigned a physical register in such a way that no two
symbolic registers that interfere with each other share the same physical
register.
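The coloring step can be illustrated with a simple greedy heuristic in Python. This is a sketch, not the allocator used by any particular compiler: the symbolic register names s1 to s4 and the interference edges are hypothetical, and a node that cannot be colored is merely reported as None (a spill candidate) rather than handled.

```python
def color_interference_graph(nodes, edges, k):
    """Greedy k-coloring. Returns node -> color in range(k), or None for a
    node that could not be colored with the k available registers."""
    neighbors = {node: set() for node in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    assignment = {}
    # Color the nodes with the most neighbors first (ties broken by name).
    for node in sorted(nodes, key=lambda n: (-len(neighbors[n]), n)):
        used = {assignment[m] for m in neighbors[node]
                if assignment.get(m) is not None}
        free = [color for color in range(k) if color not in used]
        assignment[node] = free[0] if free else None
    return assignment

nodes = ["s1", "s2", "s3", "s4"]
edges = [("s1", "s2"), ("s2", "s3"), ("s1", "s3"), ("s3", "s4")]
three = color_interference_graph(nodes, edges, 3)
two = color_interference_graph(nodes, edges, 2)
print(three)
print(two)
```

With three registers every symbolic register receives a color. With two, the triangle s1, s2, s3 cannot be colored, so one of its nodes comes back as None.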
Algorithm: DAG Construction
We assume a three-address statement has one of the following forms:
Case (i) x := y op z
Case (ii) x := op y
Case (iii) x := y
The DAG can be constructed with the following steps.
Step 1: If node(y) is undefined, create a leaf node(y). Similarly, in case (i), if node(z) is undefined, create a leaf node(z).
Step 2:
Case (i): find a node labeled op whose left child is node(y) and whose right child is node(z); this is the
check for a common subexpression. If there is no such node, create it. Let n be the node found or created.
Case (ii): determine whether there is a node labeled op whose only child is node(y). If not, create such a node. Let n be the node found or created.
Case (iii): let n be node(y).
Step 3: Delete x from the list of attached identifiers of node(x). Append x to the list of attached identifiers of
the node n found in step 2, and set node(x) to n.
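The three steps can be sketched in Python as follows. The class name DagBuilder, its method names, and the use of "[]" as the operator for array indexing are all choices made for this illustration. The block fed to it is the nine-statement example of the next section.

```python
class DagBuilder:
    def __init__(self):
        self.nodes = []      # node id -> (label, tuple of child ids)
        self.index = {}      # (label, children) -> node id, to find existing nodes
        self.current = {}    # identifier -> node id holding its current value
        self.attached = {}   # node id -> list of attached identifiers

    def _node(self, label, children=()):
        key = (label, children)
        if key not in self.index:            # create the node only if it is new
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
            self.attached[self.index[key]] = []
        return self.index[key]

    def _operand(self, name):
        # Step 1: create a leaf if the name has no node yet.
        if name not in self.current:
            self.current[name] = self._node(name)
        return self.current[name]

    def assign(self, x, y, op=None, z=None):
        ny = self._operand(y)
        if op is None:                       # case (iii): x := y
            n = ny
        elif z is None:                      # case (ii): x := op y
            n = self._node(op, (ny,))
        else:                                # case (i): x := y op z
            n = self._node(op, (ny, self._operand(z)))
        # Step 3: detach x from its old node and attach it to n.
        old = self.current.get(x)
        if old is not None and x in self.attached[old]:
            self.attached[old].remove(x)
        self.attached[n].append(x)
        self.current[x] = n
        return n

dag = DagBuilder()
n1 = dag.assign("t1", "4", "*", "i")
dag.assign("t2", "a", "[]", "t1")
n3 = dag.assign("t3", "4", "*", "i")
dag.assign("t4", "b", "[]", "t3")
dag.assign("t5", "t2", "*", "t4")
n6 = dag.assign("t6", "prod", "+", "t5")
dag.assign("prod", "t6")
n8 = dag.assign("t7", "i", "+", "1")
dag.assign("i", "t7")
print(n1 == n3, dag.attached[n1], dag.attached[n6], dag.attached[n8])
```

The common subexpression 4*i is found through the index lookup, so t1 and t3 end up attached to the same node.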
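DAG Representation of Basic Block
Example:
(1) t1 := 4*i
(2) t2 := a[t1]
(3) t3 := 4*i
(4) t4 := b[t3]
(5) t5 := t2*t4
(6) t6 := prod + t5
(7) prod := t6
(8) t7 := i+1
(9) i := t7
DAG for this block (figure reconstructed as text; attached identifiers are given in brackets, and the leaves prod and i stand for the initial values of those names):
*   [t1, t3]     children: 4, i
[]  [t2]         children: a, node [t1, t3]
[]  [t4]         children: b, node [t1, t3]
*   [t5]         children: node [t2], node [t4]
+   [t6, prod]   children: prod, node [t5]
+   [t7, i]      children: i, 1
The original figure also carries the labels 20 and (1), which belong to a statement that is not part of the listing above.
Example:
t1:=a+b
t2:=c+d
t3:=e-t2
t4:=t1-t3
Three Address Code
DAG for this code (figure reconstructed as text):
-   [t4]   children: node [t1], node [t3]
+   [t1]   children: a, b
-   [t3]   children: e, node [t2]
+   [t2]   children: c, d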
Example: Rearranging Order
Original order:
t1:=a+b
t2:=c+d
t3:=e-t2
t4:=t1-t3
Three Address Code
MOV a, R0
ADD b, R0
MOV c, R1
ADD d, R1
MOV R0, t1
MOV e, R0
SUB R1, R0
MOV t1, R1
SUB R0, R1
MOV R1, t4
Assembly Code (10 instructions)
Rearranged order:
t2:=c+d
t3:=e-t2
t1:=a+b
t4:=t1-t3
Three Address Code
MOV c, R0
ADD d, R0
MOV e, R1
SUB R0, R1
MOV a, R0
ADD b, R0
SUB R1, R0
MOV R0, t4
Assembly Code (8 instructions)
Computing t1 immediately before t4 saves the store and the reload of t1.
Algorithm: Heuristic Ordering
Obtain all the interior nodes. Initially, all interior nodes are unlisted.
while (unlisted interior nodes remain)
{
    pick an unlisted node n, all of whose parents have been listed;
    list n;
    while (the leftmost child m of n has no unlisted parent AND is not a leaf)
    {
        list m;
        n = m;
    }
}
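A runnable Python version of the loop above, offered as a sketch. The DAG is an assumption: it is rebuilt from the seven-statement three-address code at the end of the example that follows, with node k standing for temporary tk, and "pick an unlisted node" is made deterministic by always choosing the smallest eligible node number.

```python
def heuristic_order(children):
    """children: interior node -> tuple of children, leftmost first.
    Anything that is not a key of `children` is a leaf.
    Returns the nodes in listing order."""
    parents = {}
    for node, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, set()).add(node)

    listed = []

    def all_parents_listed(node):
        return all(p in listed for p in parents.get(node, ()))

    while len(listed) < len(children):
        # Pick an unlisted interior node all of whose parents are listed.
        n = min(node for node in children
                if node not in listed and all_parents_listed(node))
        listed.append(n)
        while True:
            m = children[n][0]                       # leftmost child of n
            if m not in children or m in listed or not all_parents_listed(m):
                break                                # leaf, or not yet allowed
            listed.append(m)
            n = m
    return listed

dag = {
    1: (2, 3), 2: (6, 4), 3: (4, "e"), 4: (5, 8),
    5: (6, "c"), 6: ("a", "b"), 8: ("d", "e"),
}
order = heuristic_order(dag)
print(order)                    # [1, 2, 3, 4, 5, 6, 8]
print(list(reversed(order)))    # evaluation order: [8, 6, 5, 4, 3, 2, 1]
```

The nodes are evaluated in the reverse of the listing order, which here yields t8, t6, t5, t4, t3, t2, t1.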
Example: Heuristic Ordering
(The DAG figure is not reproduced; the steps below visit its interior nodes 1, 2, 3, 4, 5, 6 and 8.)
Step 1: Node 1 is listed first. The left child of 1 is 2; its parent 1 is listed, so list 2.
        The left child of 2 is 6; its parent 5 is not listed, so 6 cannot be listed yet.
        Listed nodes: 1 2
Step 2: The right child of 1 is 3; its parent 1 is listed, so list 3.
        The left child of 3 is 4; its parents 2 and 3 are listed, so list 4.
        Listed nodes: 1 2 3 4
Step 3: The left child of 4 is 5; its parent 4 is listed, so list 5.
        The left child of 5 is 6; its parents 2 and 5 are listed, so list 6.
        Listed nodes: 1 2 3 4 5 6
Step 4: The right child of 4 is 8; its parent 4 is listed, so list 8.
        Listed nodes: 1 2 3 4 5 6 8
Evaluating the nodes in the reverse of the listed order (8 6 5 4 3 2 1) gives the optimal three address code:
t8=d+e
t6=a+b
t5=t6-c
t4=t5*t8
t3=t4-e
t2=t6+t4
t1=t2*t3
Labeling Algorithm
Example: Labeling Algorithm
t1:=a+b
t2:=c+d
t3:=e-t2
t4:=t1-t3
Three Address Code
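[Figure: expression tree with root t4, whose children are t1 (children a, b) and t3 (children e, t2); t2 has children c, d.]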
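The slides give only the example, not the labeling rule. The Python sketch below assumes the usual Sethi-Ullman style rule: a leaf is labeled 1 if it is a leftmost child and 0 otherwise, and an interior node with child labels l1 and l2 gets max(l1, l2) if they differ and l1 + 1 if they are equal. The dictionary encoding of the tree is likewise an assumption made for the illustration.

```python
def label(node, tree, is_leftmost=True):
    """tree: interior node -> (left child, right child); anything else is a leaf.
    Returns the label of `node` under the rule stated above."""
    if node not in tree:
        return 1 if is_leftmost else 0      # leaf: 1 for a left child, 0 for a right child
    left, right = tree[node]
    l1 = label(left, tree, True)
    l2 = label(right, tree, False)
    return max(l1, l2) if l1 != l2 else l1 + 1

tree = {
    "t4": ("t1", "t3"),
    "t1": ("a", "b"),
    "t3": ("e", "t2"),
    "t2": ("c", "d"),
}
labels = {name: label(name, tree) for name in tree}
print(labels)  # {'t4': 2, 't1': 1, 't3': 2, 't2': 1}
```

Under this rule the root label, 2, is the number of registers needed to evaluate the whole expression without storing intermediate results.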
❑ Topics to be covered
✔ Optimization technique
▪ Compile time evaluation
▪ Common sub expressions elimination
▪ Copy Propagation / Variable Propagation
▪ Code movement / Loop invariant (Frequency reduction)
▪ Strength Reduction
▪ Dead code elimination
❖ Code Optimization
[Figure: Source Program -> Front End -> Intermediate Code -> Code Generator -> Target Program]
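▪ Example (before and after common sub expressions elimination):
Before                  After
t1 := 4 * i             t1 := 4 * i
t2 := a[t1]             t2 := a[t1]
t3 := 4 * j             t3 := 4 * j
t4 := 4 * i             t4 := n
t5 := n                 t5 := b[t1] + t4
t6 := b[t4] + t5
Here 4 * i is computed twice, so the second computation is removed, its use is replaced by t1, and the remaining temporaries are renumbered.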
3. Copy Propagation / Variable Propagation
✔ Copy propagation means using one variable in place of another after a copy statement.
▪ Example:
Before:
y = x;
z = 3 + y;
After:
z = 3 + x;
✔ Here the variable y is eliminated. The necessary condition is that a variable is
assigned from another variable or from some constant.
4. Code movement / Loop invariant (Frequency reduction)
✔ There are two basic goals of code movement:
I. To reduce the size of the code.
II. To reduce the frequency of execution of code.
Before:
for (i=1; i <= 100; i++)
{
    z = i;
    x = 25 * a;
    y = x + z;
}
After:
x = 25 * a;
for (i=1; i <= 100; i++)
{
    z = i;
    y = x + z;
}
✔ Here x = 25 * a; is loop invariant. Hence in the optimized program it is computed only once
before entering the for loop. y = x + z; is not loop invariant. Hence it cannot be subjected to
frequency reduction.
5. Strength Reduction
✔ Strength reduction replaces a time-consuming
operation (a "high strength" operation) with a faster operation (a
"low strength" operation).
✔ For example: replacement of multiplication by an addition.
Before:
for(i=1;i<=50;i++)
{
    count = i*50;
}
After:
temp=0;
for(i=1;i<=50;i++)
{
    temp = temp+50;
    count = temp;
}
✔ In both versions count takes the values 50, 100, 150, and so on.
✔ Here the high strength operator * in i*50 inside the loop is replaced
by the low strength operator + in temp+50.
6. Dead code elimination
✔ Code that can be omitted from a program without affecting its results is called dead
code.
✔ A variable is said to be dead at a point in a program if the value it contains is never
used afterwards. Code that only computes such a variable is dead code.
✔ Example:
main()
{
    ……..
    a = 5;
    if (a == 5)
    {
        c++;
        printf("%d", c);
    }
    else
    {
        /* This branch can never execute, so it can be eliminated. */
        k++;
        printf(" this is a dead code");
    }
}
Peephole Optimization
Peephole optimization is a simple and effective technique for locally improving target code.
This technique improves the performance of the target program by examining a
short sequence of target instructions (called the peephole) and replacing these instructions with a
shorter or faster sequence whenever possible.
Peephole is a small, moving window on the target program.
Redundant Loads & Stores
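Redundant loads and stores can be eliminated by transformations of the following
type.
Example:
MOV R0,x
MOV x,R0
We can eliminate the second instruction, since x is already in R0, provided the second instruction has no label (otherwise control could reach it without the first instruction having been executed).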
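A peephole pass for this pattern fits in a few lines of Python. The sketch assumes instructions are encoded as hypothetical (label, opcode, source, destination) tuples, with label set to None when the instruction is unlabeled; a labeled instruction is never removed, since a jump might reach it directly.

```python
def remove_redundant_moves(code):
    """Drop 'MOV b, a' when it immediately follows 'MOV a, b' and is unlabeled."""
    result = []
    for instruction in code:
        label, opcode, source, destination = instruction
        if result and label is None and opcode == "MOV":
            _, prev_opcode, prev_source, prev_destination = result[-1]
            if (prev_opcode == "MOV"
                    and (prev_source, prev_destination) == (destination, source)):
                continue        # the value is already where this MOV would put it
        result.append(instruction)
    return result

code = [
    (None, "MOV", "b", "R0"),
    (None, "ADD", "c", "R0"),
    (None, "MOV", "R0", "a"),
    (None, "MOV", "a", "R0"),   # redundant: R0 already holds a
    (None, "ADD", "e", "R0"),
    (None, "MOV", "R0", "d"),
]
optimized = remove_redundant_moves(code)
print(len(code), len(optimized))  # 6 5
```

The input is the six-instruction translation of a := b + c; d := a + e from the Instruction selection section, and the pass removes its fourth instruction.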
Flow of Control Optimization
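Unnecessary jumps can be eliminated in either the intermediate code or the target code by
the following types of peephole optimizations.
A jump to a jump can be replaced by a jump directly to the final target:
Before               After
goto L1              goto L2
……                   ……
L1: goto L2          L1: goto L2
The same applies to a conditional jump:
Before               After
if a<b goto L1       if a<b goto L2
……                   ……
L1: goto L2          L1: goto L2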
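The jump-to-jump replacement can be sketched in Python as below. The (label, operation, target) encoding is an assumption for the example: the operation "goto" marks an unconditional jump, and any other operation with a target is treated as a conditional jump. Chains of jumps are followed to their end, with a guard against jumps that form a cycle.

```python
def thread_jumps(code):
    """Retarget every jump whose target is itself an unconditional goto."""
    # Labels that sit on an unconditional goto, mapped to where that goto leads.
    forwards = {label: target for label, operation, target in code
                if label is not None and operation == "goto"}

    def final(target):
        seen = set()
        while target in forwards and target not in seen:
            seen.add(target)                 # guard against goto cycles
            target = forwards[target]
        return target

    return [(label, operation, final(target) if target is not None else None)
            for label, operation, target in code]

code = [
    (None, "goto", "L1"),
    (None, "x := 1", None),
    ("L1", "goto", "L2"),
    (None, "if a<b goto", "L1"),
    ("L2", "y := 2", None),
]
threaded = thread_jumps(code)
print(threaded[0], threaded[3])
```

Both the unconditional and the conditional jump to L1 now go straight to L2. If no other jump refers to L1 afterwards, the instruction at L1 could then be removed as well, which this sketch does not attempt.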
Algebraic simplification
Peephole optimization is an effective technique for algebraic simplification.
Statements such as x := x + 0 or x := x * 1 have no effect and can be eliminated by peephole optimization.
Reduction in strength
Certain machine instructions are cheaper than others.
To improve the performance of the target code, we can replace expensive instructions with
equivalent cheaper ones.
For example, computing x^2 as x * x is cheaper than calling an exponentiation routine.
Similarly, addition and subtraction are cheaper than multiplication and division, so a
multiplication or division can sometimes be replaced by an equivalent sequence of additions or subtractions.
Machine idioms
The target machine may have hardware instructions that implement certain operations efficiently.
Detecting situations where these instructions apply, and replacing longer instruction sequences with them,
improves efficiency.
Example: Some machines have auto-increment or auto-decrement addressing modes.
These modes can be used in code for statements like i=i+1.
Loops in Flow Graphs
Dominators
In a flow graph, a node d dominates a node n if every path from the initial node to n goes
through d.
This is denoted as 'd dom n'.
The initial node dominates every node in the flow graph.
Every node dominates itself.
[Figure not reproduced: example flow graph.]
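Dominator sets can be computed by iterating the rule "n is dominated by itself and by whatever dominates all of its predecessors" until nothing changes. The Python sketch below uses a hypothetical six-node flow graph, since the figure from the slides is not available; the edge list is an assumption made purely for illustration.

```python
def dominators(successors, entry):
    """successors: node -> list of successor nodes (every node must be a key).
    Returns node -> set of nodes that dominate it."""
    nodes = set(successors)
    predecessors = {n: set() for n in nodes}
    for n, targets in successors.items():
        for t in targets:
            predecessors[t].add(n)

    dom = {n: set(nodes) for n in nodes}     # start from "everything dominates n"
    dom[entry] = {entry}                     # only the entry dominates the entry
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in predecessors[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

# Hypothetical flow graph: 1 -> 2, 2 -> 3, 2 -> 4, 3 -> 5, 4 -> 5, 5 -> 2, 5 -> 6
graph = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [2, 6], 6: []}
dom = dominators(graph, 1)
print(sorted(dom[5]))  # [1, 2, 5]
print(sorted(dom[6]))  # [1, 2, 5, 6]
```

Neither 3 nor 4 dominates 5, because 5 can be reached through either of them. In this graph the edge 5 -> 2 has a head (2) that dominates its tail (5).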
Inner Loops
An inner loop is a loop that contains no other loop.
In the example flow graph (figure not reproduced), the inner loop is the one defined by the edge 4 -> 2; it consists of nodes 2, 3 and 4.
Preheader
[Figure not reproduced; surviving labels: Preheader, Header, B0.]
Reducible Flow Graph
A reducible flow graph is a flow graph whose edges can be divided into two types, forward edges and
back edges.
These edges have the following properties:
1. The forward edges form an acyclic graph.
2. The back edges are edges whose heads dominate their tails.
Nonreducible Flow Graph
A nonreducible flow graph is a flow graph that does not satisfy these properties. In the example (figure not reproduced):
1. There are no back edges.
2. The forward edges form a cycle in the graph.
Global Data Flow Analysis
The details of how dataflow equations are set up and solved depend on three factors.
1. The notions of generating and killing depend on the desired information, i.e., on the data
flow analysis problem to be solved. Moreover,
for some problems, instead of proceeding along with the flow of control and defining out[s] in
terms of in[s], we need to proceed backwards and define in[s] in terms of out[s].
2. Since data flows along control paths,
data-flow analysis is affected by the constructs in a program. In fact, when we write out[s]
we implicitly assume that there is a unique end point where control leaves the statement; in
general, equations are set up at the level of basic blocks rather than statements, because
blocks do have unique end points.
3. There are subtleties that go along with such statements as procedure calls, assignments
through pointer variables, and even assignments to array variables.
Dataflow Properties
A program point containing the definition is called Definition point.
A program point at which a reference to a data item is made is called Reference point.
A program point at which some evaluating expression is given is called Evaluation point.
[Figure not reproduced; surviving labels: B1: t1=4*i and B4: t4=a[t3].]
Reaching Definition
D1: y=2 B1
D2: y=y+2 B2
D3: x=y+2 B3
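Reaching definitions are a typical forward problem in which out[B] is defined in terms of in[B]. The Python sketch below iterates the usual equations, in[B] = union of out[P] over the predecessors P of B and out[B] = gen[B] ∪ (in[B] − kill[B]), on the three definitions listed above. The edges of the original flow graph are not recoverable, so a straight-line flow B1 -> B2 -> B3 is assumed here purely for illustration.

```python
def reaching_definitions(blocks, predecessors, gen, kill):
    """Iterate the data-flow equations until the out sets stop changing."""
    in_sets = {b: set() for b in blocks}
    out_sets = {b: set(gen[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # in[B] is the union of out[P] over all predecessors P of B.
            in_sets[b] = set().union(*(out_sets[p] for p in predecessors[b]))
            # out[B] = gen[B] united with (in[B] minus kill[B]).
            new_out = gen[b] | (in_sets[b] - kill[b])
            if new_out != out_sets[b]:
                out_sets[b], changed = new_out, True
    return in_sets, out_sets

blocks = ["B1", "B2", "B3"]
predecessors = {"B1": [], "B2": ["B1"], "B3": ["B2"]}   # assumed edges
gen = {"B1": {"D1"}, "B2": {"D2"}, "B3": {"D3"}}
kill = {"B1": {"D2"}, "B2": {"D1"}, "B3": set()}        # D1 and D2 both define y
in_sets, out_sets = reaching_definitions(blocks, predecessors, gen, kill)
print(sorted(in_sets["B3"]), sorted(out_sets["B3"]))    # ['D2'] ['D2', 'D3']
```

Under the assumed edges, D1 does not reach B3 because D2 redefines y in B2, so the use of y in D3 sees only D2.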
Live variable
Busy Expression