Unit 6
Code Generation
Code optimization
Under this topic we cover machine-independent optimizations. Machine-dependent optimizations, such
as register allocation and the use of special machine-instruction sequences (machine idioms), are
covered under the next topic, "Code generation".
The principal sources of optimization: There are some useful code-improving transformations. A
transformation is called local if it can be performed by looking only at the statements in a basic block;
otherwise, it is called global. Usually, local transformations are done first.
Function-preserving transformations: Some function-preserving transformations are: 1. Common
subexpression elimination 2. Copy propagation 3. Dead-code elimination 4. Constant folding.
1. Common subexpression elimination: An expression E is called a common subexpression if E
was previously computed, and the values of variables in E have not changed since the previous
computation.
2. Copy propagation: Assignments of the form a:=b are called copy statements or copies. Copy
statements are introduced by some optimization algorithms, such as the algorithm for common-
subexpression elimination. The idea behind the copy-propagation transformation is to use b for a
wherever possible after the copy statement a:=b.
3. Dead-code elimination: If the value of a variable is not used after a certain point, the variable is
dead at that point. Similarly, dead code (useless code) consists of statements that compute values
which are never used. Copy propagation often turns a copy statement into dead code.
4. Constant folding: If the value of an expression can be computed at compile time, the constant can
be used in place of the expression. This is called constant folding. A short sketch combining all
four transformations is given below.
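The following C fragment (all names are hypothetical, chosen only for illustration) shows how the
four transformations interact; the comments indicate what an optimizer would do at each line.

/* before optimization */
int demo(int a[], int b[], int i) {
    int t1 = 4 * i;     /* first computation of 4*i                 */
    int x  = a[t1];
    int t2 = 4 * i;     /* common subexpression: reuse t1 instead   */
    int y  = b[t2];
    int c  = x;         /* copy statement c:=x                      */
    int z  = c + y;     /* copy propagation: replace c by x         */
    int w  = 2 * 3;     /* constant folding: replace by 6           */
    return z + w;
}
/* after common-subexpression elimination, copy propagation,
 * dead-code elimination and constant folding, the body is
 * equivalent to:
 *     int t1 = 4 * i;
 *     int z  = a[t1] + b[t1];    (t2 and c are now dead)
 *     return z + 6;
 */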
Loop optimizations: Since programs spend most of their time in loops, especially inner loops, loops are
very important places for optimization. Some important loop-optimization techniques are: 1. Code
motion 2. Induction-variable elimination 3. Reduction in strength.
1. Code motion: A loop-invariant computation (an expression whose value does not change during
loop iterations) can be moved before the loop. This decreases the amount of code inside the loop.
2. Induction-variable elimination: If a:=b*4 is an assignment, then every time b increases by 1, a
increases by 4. Here a and b are induction variables. When there are two or more induction
variables in a loop, it may be possible to eliminate all except one.
3. Reduction in strength: Here an expensive operation is replaced by a cheaper one. For ex., addition
may replace multiplication. A sketch combining these techniques is given below.
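A minimal C sketch (the functions before() and after() are hypothetical) of code motion together with
strength reduction; the transformed loop updates an offset by addition instead of recomputing a
multiplication:

void before(int a[], int n, int x, int y) {
    for (int j = 0; j < n; j++)
        a[j * 4] = x + y;   /* x+y is loop-invariant; j*4 is an induction expression */
}

void after(int a[], int n, int x, int y) {
    int t = x + y;          /* code motion: hoisted out of the loop   */
    int t4 = 0;             /* induction variable tracking j*4        */
    for (int j = 0; j < n; j++) {
        a[t4] = t;
        t4 += 4;            /* reduction in strength: + replaces *    */
    }
}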
Basic block: A basic block is a sequence of consecutive statements in which flow of control enters at
the beginning and leaves at the end, without a halt or possibility of branching except at the end.
Flow graph (B1 falls through to B2; B2 and B3 loop on themselves; B4 branches to B6 or falls
through to B5; B5 jumps back to B2):

B1: i:=m-1
    j:=n
    t1:=4*n
    v:=a[t1]

B2: i:=i+1
    t2:=4*i
    t3:=a[t2]
    if t3<v goto B2

B3: j:=j-1
    t4:=4*j
    t5:=a[t4]
    if t5>v goto B3

B4: if i>=j goto B6

B5: t6:=4*i
    x:=a[t6]
    t7:=4*i
    t8:=4*j
    t9:=a[t8]
    a[t7]:=t9
    t10:=4*j
    a[t10]:=x
    goto B2

B6: t11:=4*i
    x:=a[t11]
    t12:=4*i
    t13:=4*n
    t14:=a[t13]
    a[t12]:=t14
    t15:=4*n
    a[t15]:=x
Common subexpression elimination: After removing the common subexpressions 4*i and 4*j, block B5
becomes:
t6:=4*i
x:=a[t6]
t8:=4*j
t9:=a[t8]
a[t6]:=t9
a[t8]:=x
goto B2
Now, using global common subexpression elimination 4*i and 4*j can be replaced by t2 and t4,
respectively. Therefore, B5 becomes:
x:=a[t2]
t9:=a[t4]
a[t2]:=t9
a[t4]:=x
goto B2
Now a[t2] and a[t4] can be replaced by t3 and t5, respectively. Finally, B5 becomes:
x:=t3
t9:=t5
a[t2]:=t9
a[t4]:=x
goto B2
Copy propagation: using t3 for x and t5 for t9 after the copies, B5 becomes:
x:=t3
t9:=t5
a[t2]:=t5
a[t4]:=t3
goto B2
Dead-code elimination: the copies x:=t3 and t9:=t5 are now dead, so B5 becomes:
a[t2]:=t5
a[t4]:=t3
goto B2
Applying the same transformations to B6 (the common subexpressions 4*i and 4*n are available as
t2 and t1), B6 becomes:
x:=t3
t14:=a[t1]
a[t2]:=t14
a[t1]:=t3
After dead-code elimination B6 becomes:
t14:=a[t1]
a[t2]:=t14
a[t1]:=t3
In a loop consisting of B3 alone, j and t4 are induction variables. By applying induction-variable
elimination and reduction in strength, we replace the assignment t4:=4*j by t4:=t4-4. We place an
initialization of t4 at the end of the block where j itself is initialized (we add t4:=4*j at the end of
block B1). A similar transformation is done in block B2.
Now the only use of i and j is in the test in block B4, which can be replaced by t2>=t4. Now i and j
become dead variables, and the code assigning values to them is also dead code.
Final flow graph:

B1: i:=m-1
    j:=n
    t1:=4*n
    v:=a[t1]
    t2:=4*i
    t4:=4*j

B2: t2:=t2+4
    t3:=a[t2]
    if t3<v goto B2

B3: t4:=t4-4
    t5:=a[t4]
    if t5>v goto B3

B4: if t2>=t4 goto B6

B5: a[t2]:=t5
    a[t4]:=t3
    goto B2

B6: t14:=a[t1]
    a[t2]:=t14
    a[t1]:=t3
The dag representation of basic blocks: Directed acyclic graphs (dags) are useful data structures for
implementing transformations on basic blocks. Using a dag, we can determine the common
subexpressions within a block, determine which names are used inside the block but evaluated outside
it, and determine which statements of the block could have their computed value used outside the block.
A dag for a basic block is a directed acyclic graph with the following labels on its nodes:
1. Leaves are labeled by unique identifiers, which are either variable names or constants.
2. Interior nodes are labeled by an operator symbol.
3. Nodes are also optionally given a sequence of identifiers as labels.
Ex.
1. t1:=4*i
2. t2:=a[t1]
3. t3:=4*i
4. t4:=b[t3]
5. t5:=t2*t4
6. t6:=prod+t5
7. prod:=t6
8. t7:=i+1
9. i:=t7
10. if i<=20 goto (1)
(dag: the leaves are a, b, 4, i0, prod0, 1 and 20; node t1,t3 is * with children 4 and i0; node t2 is [ ]
with children a and t1; node t4 is [ ] with children b and t1; node t5 is * with children t2 and t4; node
t6,prod is + with children prod0 and t5; node t7,i is + with children i0 and 1; node (1) is <= with
children i and 20.)
Rewriting the block from the dag, the common subexpression 4*i is computed only once and the copy
into t3 disappears:
t1:=4*i
t2:=a[t1]
t4:=b[t1]
t5:=t2*t4
prod:=prod+t5
i:=i+1
if i<=20 goto (1)
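As a sketch of how such a dag can be built, the following C fragment reuses an existing node whenever
an identical (operator, left child, right child) triple already exists, which is exactly how the two
computations of 4*i above collapse to a single node. The data structures and names are illustrative
assumptions, not from the notes.

#include <string.h>

enum { MAXN = 256 };
typedef struct {
    char op;                /* operator, or 0 for a leaf        */
    int  left, right;       /* child indices, -1 if none        */
    char name[8];           /* identifier/constant for leaves   */
} DagNode;

static DagNode nodes[MAXN];
static int nnodes;

/* return an existing interior node with the same operator and
   children, or create a new one */
int dag_node(char op, int left, int right) {
    for (int k = 0; k < nnodes; k++)
        if (nodes[k].op == op && nodes[k].left == left
                              && nodes[k].right == right)
            return k;                 /* common subexpression found */
    nodes[nnodes].op = op;
    nodes[nnodes].left = left;
    nodes[nnodes].right = right;
    return nnodes++;
}

/* return the leaf for an initial value, creating it on first use */
int dag_leaf(const char *name) {
    for (int k = 0; k < nnodes; k++)
        if (nodes[k].op == 0 && strcmp(nodes[k].name, name) == 0)
            return k;
    nodes[nnodes].op = 0;
    nodes[nnodes].left = nodes[nnodes].right = -1;
    strcpy(nodes[nnodes].name, name);
    return nnodes++;
}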
Dominators: If every path from the initial node to n goes through d, then node d dominates node n.
Domination information can be presented using a dominator tree.
Ex. (figure: a flow graph on nodes 1-10 and its dominator tree. In the dominator tree, node 1 is the
root with children 2 and 3; node 3 has child 4; node 4 has children 5, 6 and 7; node 7 has child 8; and
node 8 has children 9 and 10.)
Natural loops: Natural loops can be easily improved. Using dominator information, we can find the
natural loops in a flow graph. A natural loop has the following properties:
1. It has a single entry point called the header. The header dominates all nodes in the loop.
2. There is at least one way to iterate the loop.
A back edge is an edge whose head dominates its tail (if a->b is an edge, b is the head and a is the
tail). The natural loop of a back edge n->d is d plus the set of nodes that can reach n without going
through d.
In the flow graph above, the back edges are 7->4, 10->7, 4->3, 8->3 and 9->1, and the corresponding
natural loops are {4,5,6,7,8,10}, {7,8,10}, {3,4,5,6,7,8,10} (for both edges 4->3 and 8->3), and the
entire flow graph, respectively. A sketch of the standard marking algorithm for a natural loop is given
below.
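A minimal C sketch of the usual worklist construction of a natural loop: starting from the tail n of a
back edge n->d, walk predecessor edges backwards, never passing through the header d. The graph
representation (predecessor lists) is an assumption made for illustration.

enum { NNODES = 11 };               /* nodes 1..10, index 0 unused      */
int npred[NNODES];
int pred[NNODES][NNODES];           /* pred[m] lists predecessors of m  */

/* mark loop[m]=1 for every node m in the natural loop of back edge n->d */
void natural_loop(int n, int d, int loop[NNODES]) {
    int stack[NNODES], sp = 0;
    for (int m = 0; m < NNODES; m++) loop[m] = 0;
    loop[d] = 1;                    /* the header is in the loop, and the  */
                                    /* backward search never crosses it    */
    if (!loop[n]) { loop[n] = 1; stack[sp++] = n; }
    while (sp > 0) {
        int m = stack[--sp];
        for (int p = 0; p < npred[m]; p++)
            if (!loop[pred[m][p]]) {        /* unvisited predecessor */
                loop[pred[m][p]] = 1;
                stack[sp++] = pred[m][p];
            }
    }
}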
Pre-header: Many code-optimization transformations need to move statements before the header. A
preheader is a new block created for this purpose.
Reducible flow graphs: In a reducible flow graph there are no jumps into the middle of loops from
outside; the only entry to a loop is through its header.
A flow graph is reducible if its edges can be partitioned into two disjoint groups as follows:
1. The forward edges form an acyclic graph in which every node can be reached from the initial node of
G. 2. The back edges consist only of edges whose heads dominate their tails.
The flow graph given above satisfies these conditions and is therefore reducible.
However, the following flow graph does not satisfy them and is therefore nonreducible:
(figure: nodes 1, 2 and 3, where 1 branches to both 2 and 3, and 2 and 3 jump to each other; the cycle
{2,3} has no unique header, since it can be entered at either node.)
Programs in many languages give rise only to reducible flow graphs as long as gotos are not used.
Data-flow equations (global data-flow analysis): A data-flow equation has the form:
out[S] = gen[S] U (in[S] - kill[S])
That is, the information at the end of a statement is either generated within the statement, or enters at
the beginning and is not killed as control flows through the statement.
Data-flow information can be used to find opportunities for constant folding. Algorithms for code
motion and induction-variable elimination also use this information.
A definition of a variable x is a statement that assigns a value to x. (Reaching definitions) A definition d
reaches a point p if there is a path from the point immediately following d to p, such that d is not killed
along the path.
/* d1 */ i:=m-1;
/* d2 */ j:=n;
/* d3 */ a:=u1;
do
/* d4 */ i:=i+1;
/* d5 */ j:=j-1;
if e1 then
/* d6 */ a:=u2
else
/* d7 */ i:=u3
while e2
For the blocks containing d6 and d7:
B3 contains d6: a:=u2, so gen[B3]={d6} and kill[B3]={d3}.
B4 contains d7: i:=u3, so gen[B4]={d7} and kill[B4]={d1,d4}.
Representing each set as a bit vector over the seven definitions d1-d7 (block B1 contains d1, d2, d3
and block B2 contains d4, d5):
in[B2]  = out[B1] U out[B3] U out[B4]
        = 111 0000 + 000 0010 + 000 0001 = 111 0011
out[B2] = gen[B2] U (in[B2] - kill[B2])
        = 000 1100 + (111 0011 - 110 0001) = 001 1110
From the second pass onwards there is no change in the out sets, so the iterative algorithm terminates.
A C sketch of this iterative computation is given below.
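A minimal C sketch of the iterative reaching-definitions computation over bit vectors; the block count,
the predecessor representation and the use of one unsigned word per set are illustrative assumptions.

enum { NB = 5 };                      /* number of basic blocks (assumed) */
unsigned gen[NB], kill[NB], in[NB], out[NB];
int npreds[NB];
int preds[NB][NB];                    /* preds[b] lists predecessors of b */

void reaching_definitions(void) {
    for (int b = 0; b < NB; b++)
        out[b] = gen[b];              /* initial estimate: in[b] empty    */
    int changed = 1;
    while (changed) {                 /* iterate to a fixed point         */
        changed = 0;
        for (int b = 0; b < NB; b++) {
            in[b] = 0;
            for (int p = 0; p < npreds[b]; p++)
                in[b] |= out[preds[b][p]];   /* union over predecessors   */
            unsigned o = gen[b] | (in[b] & ~kill[b]);
            if (o != out[b]) { out[b] = o; changed = 1; }
        }
    }
}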
(figure: a flow graph with blocks B1-B5 containing the following definitions:)
B1: d1: i:=2
    d2: j:=i+1
B2: d3: i:=1
B3: d4: j:=j+1
B4: d5: j:=j-4
B5: (no definitions)
In the flow graph above, there are three uses of names: d2 uses i, and d4 and d5 use j.
The ud-chain of i in d2 is only {d1}.
Since the use of j at d4 in B3 is not preceded by a definition of j in B3, we have to consider in[B3]
(computed using the in and out sets). in[B3]={d2,d3,d4,d5}. Of these, all except d3 are definitions of j,
so the ud-chain of j in d4 is {d2,d4,d5}.
Since the use of j at d5 in block B4 is not preceded by a definition of j in B4, we have to consider
in[B4]. in[B4]={d3,d4}. Of these, only d4 defines j, so the ud-chain of j in d5 is only {d4}.
Application of ud-chains: If only one definition of a name A reaches a point p, and that definition is
A:=5, then we know that A has the value 5 at that point, and we can substitute 5 for A if there is a use
of A at p.
Ex.: in[B5]={d3,d4,d5}. Of these, only d3: i:=1 is a definition of i. Therefore, if there were a use of i
in B5 that preceded any definition of i in B5, it could be replaced by a use of the constant 1.
Code generation
Our target machine is a byte-addressable machine with four bytes per word and n general purpose
registers, R0, R1, . . . , Rn-1. It has two-address instructions of the form
Op source destination
Some commonly used instructions are:
MOV (move source to destination)
ADD (add source to destination)
SUB (subtract source from destination)
The addressing modes, their forms, the addresses they denote, and their added costs are:

Mode               Form    Address                  Added cost
Absolute           M       M                        1
Register           R       R                        0
Indexed            c(R)    c+contents(R)            1
Indirect register  *R      contents(R)              0
Indirect indexed   *c(R)   contents(c+contents(R))  1
Instruction costs: The cost of an instruction is one plus the costs associated with its source and
destination addressing modes. Instruction cost corresponds to the length of the instruction, because in
most machines the time taken to fetch an instruction exceeds the time taken to execute it.
Ex. The statement a:=b+c can be implemented in several ways, with different costs:

MOV b,a
ADD c,a          cost = 6

MOV *R1,*R0
ADD *R2,*R0      cost = 2   (assuming R0, R1, R2 hold the addresses of a, b, c)

ADD R2,R1
MOV R1,a         cost = 3   (assuming b and c are in registers R1 and R2)
Ex. The assignment d:=(a-b)+(a-c)+(a-c) may be translated into the following three-address
statements:
t1:=a-b
t2:=a-c
t3:=t1+t2
d:=t2+t3
(Here d is live at the end.)
The code generation algorithm produces the code below for these three-address statements.
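(Assuming two registers R0 and R1 are available, the output would be along the following lines; the
exact register choices depend on the algorithm's register-selection function getreg:)
MOV a,R0
SUB b,R0
MOV a,R1
SUB c,R1
ADD R1,R0
ADD R1,R0
MOV R0,d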
The code generation algorithm produces the following code for the indexed assignment a:=b[i]:

i in register Ri:   MOV b(Ri),R               cost = 2
i in memory Mi:     MOV Mi,R ; MOV b(R),R     cost = 4
i on the stack:     MOV Si(A),R ; MOV b(R),R  cost = 4

In the case of the stack, we assume that i is on the stack at offset Si and that the pointer to the
activation record for i is in register A.
The code generation algorithm produces analogous code for pointer assignments such as a:=*p and
*p:=a, depending on whether p is in a register, in memory, or on the stack.
Conditional statements: We use the instruction CJ<= z, which means: if the condition code is negative
or zero, jump to z. The instruction CMP x,y sets the condition code according to the comparison of x
and y.
For ex., if x<y goto z can be implemented by
CMP x,y
CJ< z
Similarly, the pair x:=y+z; if x<0 goto z can be implemented by
MOV y,R0
ADD z,R0
MOV R0,x
CJ< z
Here the condition code is already set by the value computed into R0, so no separate CMP is needed.
Peephole optimization: This is a simple but effective technique for improving the target code locally. A
short sequence of target instructions (called the peephole) is examined and replaced by a shorter or
faster sequence wherever possible. We regard peephole optimization as a technique for improving the
quality of the target code, but the same technique can also be applied directly after intermediate code
generation to improve the intermediate representation.
The peephole is a small, moving window on the target program. The code in the peephole need not be
contiguous, but some implementations require this. Each improvement may create new opportunities for
further improvement. Therefore repeated passes over the target code may be necessary to get the
maximum benefit. Given below are the program transformations usually done in peephole optimization
technique:
Redundant-instruction elimination ( Redundant Loads and Stores):
In the following instruction sequence instruction (2) can be deleted.
(1) MOV R0,a
(2) MOV a,R0
If (2) had a label, we could not be sure that (1) is always executed immediately before (2), and so we
could not remove (2). In other words, (1) and (2) must be in the same basic block for this
transformation to be safe. A sketch of such a peephole pass is given below.
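A minimal C sketch of a two-instruction peephole that drops a load "MOV a,R0" when it immediately
follows "MOV R0,a" and carries no label; the instruction representation is an assumption made for
illustration.

#include <string.h>

typedef struct {
    char op[8], src[16], dst[16];
    int has_label;                    /* nonzero if a jump may land here */
} Ins;

/* compact code[] in place, returning the new length */
int peephole_loads(Ins code[], int n) {
    int m = 0;
    for (int i = 0; i < n; i++) {
        if (m > 0 && !code[i].has_label           /* same basic block   */
            && strcmp(code[i].op,  "MOV") == 0
            && strcmp(code[m-1].op, "MOV") == 0
            && strcmp(code[i].src,  code[m-1].dst) == 0
            && strcmp(code[i].dst,  code[m-1].src) == 0)
            continue;                             /* drop redundant load */
        code[m++] = code[i];
    }
    return m;
}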
Unreachable code: Unreachable instructions can be removed. An unlabeled instruction immediately
following an unconditional jump can be removed. For ex., consider the C program fragment below:
#define debug 0
…
if(debug) {
print debugging information
}
Since debug is set to 0 at the beginning of the program (we should do a global reaching-definitions
data-flow analysis to find the definition of debug reaching the if statement), constant propagation will
give us the following code:
if 0 != 1 goto L2
print debugging information
L2:
The first line of the above code can be replaced by goto L2, since the condition 0 != 1 is always true.
Now all the statements that print debugging information are unreachable and can be eliminated.
Flow-of-control optimizations: The intermediate code generation algorithms frequently produce jumps
to jumps, jumps to conditional jumps, etc. These unnecessary jumps can be eliminated.
For ex., the sequence
if a<b goto L1
…
L1: goto L2
can be replaced by
if a<b goto L2
…
L1: goto L2
Algebraic simplification: Simple code generation algorithms may produce statements that can be
simplified using algebraic identities, such as:
x:=x+0
or
x:=x*1
Such statements can easily be eliminated by peephole optimization.
Reduction in strength: We can replace expensive operations by equivalent cheaper operations available
on the target machine. For ex., x^2 is cheaper to implement as x*x than as a call to an exponentiation
routine, and fixed-point multiplication or division by a power of two is cheaper to implement as a shift.
Use of machine idioms: The target machine may have hardware instructions that implement certain
specific operations efficiently. For ex., some machines have auto-increment and auto-decrement
addressing modes, which add or subtract one from an operand before or after using its value. These
modes improve the code substantially when pushing or popping a stack, as required in parameter
passing.
Generating code from dags: Now we will generate code for a basic block from its dag representation.
From a dag it is easier to see the order of the final computation sequence than from a linear sequence
of three-address statements or quadruples. When the dag is a tree, we can generate code that is
provably optimal under criteria such as program length or the fewest no. of temporaries used.
Ex. 1 Given below is an ex. which shows how the order of computation can affect the cost of the
resulting object code.
Basic block: t1:= a+b
t2:= c+d
t3:= e-t2
t4:= t1-t3
dag:
(node t4 is - with left child t1 and right child t3; t1 is + with leaves a0 and b0; t3 is - with left leaf e0
and right child t2; t2 is + with leaves c0 and d0.)
Now, using the code-generation algorithm, we get the following code sequence (we assume that two
registers R0 and R1 are available, and that only t4 is live on exit from the given block):
MOV a,R0
ADD b,R0
MOV c,R1
ADD d,R1
MOV R0,t1
MOV e,R0
SUB R1,R0
MOV t1,R1
SUB R0,R1
MOV R1,t4
Now, we rearrange the order of statements such that t1 is computed just before t4, as shown below:
t2:= c+d
t3:= e-t2
t1:= a+b
t4:= t1-t3
Again using the code-generation algorithm, we get the following code sequence. Here we have saved
two instructions: MOV R0,t1 (which stored the value of R0 in memory location t1) and MOV t1,R1
(which reloaded the value of t1 into R1):
MOV c,R0
ADD d,R0
MOV e,R1
SUB R0,R1
MOV a,R0
ADD b,R0
SUB R1,R0
MOV R0,t4
For efficient computation of t4, the left argument of t4 must be in a register. Therefore t1, i.e., the left
operand of t4, was computed just before t4; that is why the above reordering improved the code. Given
below is an algorithm that, whenever possible, makes the evaluation of a node immediately follow the
evaluation of its leftmost argument. This is called the heuristic ordering algorithm (node-listing
algorithm). It gives the ordering in reverse order.
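A sketch of this heuristic, following the standard formulation:

while unlisted interior nodes remain do begin
    select an unlisted node n, all of whose parents have been listed;
    list n;
    while the leftmost child m of n has no unlisted parents
          and m is not a leaf do begin
        /* since n was just listed, m can now be listed directly after it */
        list m;
        n := m
    end
end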
Ex. 2
a dag:
(node 1 is * with children 2 and 3; node 2 is + with children 6 and 4; node 3 is - with children 4 and
12; node 4 is * with children 5 and 8; node 5 is - with children 6 and 7; node 6 is + with children 9 and
10; node 8 is + with children 11 and 12; the leaves are c (node 7), a (node 9), b (node 10), d (node 11)
and e (node 12).)
Now, applying the above node-listing algorithm gives us the order 1, 2, 3, 4, 5, 6, 8. Reversing this list,
we get 8, 6, 5, 4, 3, 2, 1. Therefore, the corresponding sequence of three-address statements is:
t8:= d+e
t6:= a+b
t5:= t6-c
t4:= t5*t8
t3:= t4-e
t2:= t6+t4
t1:=t2*t3
We use an algorithm called the labeling algorithm to determine the optimal order of evaluation of the
statements in a basic block when the dag representation of the block is a tree. The optimal order gives
the shortest instruction sequence.
The labeling algorithm has two parts. The first part labels each node of the tree, bottom-up, with an
integer that denotes the fewest no. of registers required to evaluate the tree with no storage of
intermediate results. The second part is a tree traversal whose order is governed by the computed node
labels; the output code is generated during this traversal.
Given the two operands of a binary operator, this algorithm evaluates the operand requiring more
registers first. If both operands require the same no. of registers, either one can be evaluated first.
1. if n is a leaf then
2.     if n is the leftmost child of its parent then
3.         label(n) := 1
4.     else label(n) := 0
   else begin /* n is an interior node */
5.     let n1, n2, ..., nk be the children of n ordered by label,
           so that label(n1) >= label(n2) >= ... >= label(nk);
6.     label(n) := max (label(ni) + i - 1) over 1 <= i <= k
   end
In the important special case where n is a binary node whose children have labels l1 and l2, the
formula of line 6 reduces to:
label(n) = max(l1, l2)  if l1 != l2
         = l1 + 1       if l1 = l2
A C sketch of this labeling is given below.
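A minimal sketch in C, assuming a simple binary node structure (the TNode type and its fields are
illustrative, not from the notes):

typedef struct tnode {
    struct tnode *left, *right;   /* both NULL for a leaf               */
    int is_leftmost;              /* 1 if leftmost child of its parent  */
    int label;
} TNode;

static int max2(int a, int b) { return a > b ? a : b; }

/* label the tree bottom-up with the fewest registers needed
   to evaluate it without storing intermediate results */
int su_label(TNode *n) {
    if (n->left == NULL && n->right == NULL)      /* leaf: lines 1-4 */
        return n->label = n->is_leftmost ? 1 : 0;
    int l1 = su_label(n->left);
    int l2 = su_label(n->right);
    /* binary special case of line 6 */
    return n->label = (l1 == l2) ? l1 + 1 : max2(l1, l2);
}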
dag:
(the same tree as before: t4 is - with children t1 and t3; t1 is + with leaves a0 and b0; t3 is - with leaf
e0 and child t2; t2 is + with leaves c0 and d0.)
labeled tree:
(t4 has label 2; its children t1 and t3 have labels 1 and 2; t3's children e and t2 have labels 1 and 1;
the leftmost leaves a, e and c have label 1, while b and d have label 0.)
Code generation from a labeled tree: Our code generation algorithm takes a labeled tree T as input and
produces as output a machine-code sequence that evaluates T into a register. The algorithm uses a
recursive procedure gencode(n) to produce machine code that evaluates the subtree of T with root n
into a register. The procedure gencode uses a stack rstack to allocate registers (suppose r registers are
available), and a stack tstack to allocate temporary memory locations.
procedure gencode(n);
begin
    /* case 0 */
    if n is a left leaf representing operand name and n is the leftmost child of its parent then
        print 'MOV' || name || ',' || top(rstack)
    else if n is an interior node with operator op, left child n1, and right child n2 then
        /* case 1 */
        if label(n2) = 0 then begin
            let name be the operand represented by n2;
            gencode(n1);
            print op || name || ',' || top(rstack)
        end
        /* case 2 */
        else if 1 <= label(n1) < label(n2) and label(n1) < r then begin
            swap(rstack);
            gencode(n2);
            R := pop(rstack); /* n2 was evaluated into register R */
            gencode(n1);
            print op || R || ',' || top(rstack);
            push(rstack, R);
            swap(rstack)
        end
        /* case 3 */
        else if 1 <= label(n2) <= label(n1) and label(n2) < r then begin
            gencode(n1);
            R := pop(rstack); /* n1 was evaluated into register R */
            gencode(n2);
            print op || top(rstack) || ',' || R;
            push(rstack, R)
        end
        /* case 4: both labels >= r, the total no. of registers */
        else begin
            gencode(n2);
            T := pop(tstack);
            print 'MOV' || top(rstack) || ',' || T;
            gencode(n1);
            push(tstack, T);
            print op || T || ',' || top(rstack)
        end
end
Ex. We can generate code for the labeled tree given above. Suppose rstack = R0, R1 initially. A trace
of the gencode routine is shown below; alongside each call, the contents of rstack at the time of the
call are shown in brackets, with the top at the right end.
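(A reconstruction of the trace and the resulting code for this tree; the case numbers refer to gencode
above:)

gencode(t4)            [R1, R0]   case 2: swap rstack, evaluate t3 first
    gencode(t3)        [R0, R1]   case 3
        gencode(e)     [R0, R1]   case 0: prints MOV e,R1
        gencode(t2)    [R0]       case 1
            gencode(c) [R0]       case 0: prints MOV c,R0
                                  prints ADD d,R0
                                  prints SUB R0,R1
    gencode(t1)        [R0]       case 1
        gencode(a)     [R0]       case 0: prints MOV a,R0
                                  prints ADD b,R0
                                  prints SUB R1,R0

The generated code, which evaluates t4 into R0, is:
MOV e,R1
MOV c,R0
ADD d,R0
SUB R0,R1
MOV a,R0
ADD b,R0
SUB R1,R0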
Algebraic properties like commutativity and associativity of operators can be used to replace a given
tree T by one with smaller labels (to avoid stores in case 4 of gencode) and/or fewer left leaves (to
avoid loads in case 0). For ex., we may replace the first tree below by the one which follows it. This
reduces the no. of left leaves by one and possibly lowers some labels as well. This is possible because
the operator + is commutative.
(figure: a + node labeled max(2,l), with a left leaf labeled 1 and a right subtree T1 labeled l, is
replaced by a + node labeled l, with T1 as its left subtree and the leaf, now labeled 0, on the right.)
Since the operator + is commutative as well as associative, a cluster of nodes labeled + can be replaced
by a left chain. To minimize the label of the root, we have to arrange for Ti1 to be the subtree with the
largest label among T1, T2, T3, T4, and we also have to ensure that Ti1 is not a leaf unless all of
T1, ..., T4 are.
(figure: a tree with root + whose children are T1 and a second + node; the second + node's children
are a + node over T2 and T3, and T4. It is replaced by the left chain +(+(+(Ti1,Ti2),Ti3),Ti4), where
Ti1, ..., Ti4 are T1, ..., T4 in some order.)