CHAPTER NINE
9. CODE OPTIMIZATION
9.1. Introduction
Optimization is a program transformation technique that tries to improve code by making it
consume fewer resources (CPU time, memory) while delivering high speed.
The code produced by straightforward compiling algorithms can often be made to run
faster or take less space, or both. This improvement is achieved by program transformations that
are traditionally called optimizations.
In optimization, high-level general programming constructs are replaced by very efficient low-
level programming codes. A code optimizing process must follow the three rules given below:
The output code must not, in any way, change the meaning of the program.
Optimization should increase the speed of the program and, if possible, the program should
demand fewer resources.
Optimization should itself be fast and should not delay the overall compiling process.
Efforts to optimize code can be made at various levels of the compiling process.
At the beginning, users can change/rearrange the code or use better algorithms to write
the code.
After generating intermediate code, the compiler can modify the intermediate code by
improving address calculations and loops.
While producing the target machine code, the compiler can make use of memory
hierarchy and CPU registers.
Optimization can be categorized broadly into two types: machine independent and machine
dependent.
Machine-independent optimization: In this optimization, the compiler takes in the
intermediate code and transforms a part of the code that does not involve any CPU registers
and/or absolute memory locations. It improves the code without taking into consideration
any properties of the target machine. For example:
do
{
item = 10;
value = value + item;
}while(value<100);
This code repeatedly assigns the constant 10 to the identifier item on every iteration. If we move the assignment out of the loop, like this:
item = 10;
do
{
value = value + item;
} while(value<100);
we not only save CPU cycles; the transformation can also be used on any processor.
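The effect of this transformation can be checked directly. Below is a minimal C sketch of both versions of the fragment; the starting value of value is not given in the text, so 0 is assumed here:

```c
#include <assert.h>

/* Original form: the loop-invariant assignment runs on every iteration. */
static int run_unoptimized(void) {
    int value = 0;                   /* assumed initial value */
    int item;
    do {
        item = 10;                   /* repeated, loop-invariant assignment */
        value = value + item;
    } while (value < 100);
    return value;
}

/* Optimized form: the invariant assignment is hoisted before the loop. */
static int run_optimized(void) {
    int value = 0;                   /* assumed initial value */
    int item = 10;                   /* assigned once, outside the loop */
    do {
        value = value + item;
    } while (value < 100);
    return value;
}
```

Both functions compute the same final value; only the number of instructions executed inside the loop changes.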
Machine-dependent optimization: This optimization is applied after the target code has been generated; the code is transformed according to properties of the target machine, such as its CPU registers and memory hierarchy.
Figure 9.1: Places for potential improvements by the user and the compiler
Function-Preserving Transformations
There are a number of ways in which a compiler can improve a program without changing the
function it computes. Common-sub expression elimination, copy propagation, dead-code
elimination, and constant folding are common examples of such function-preserving (or
semantics-preserving) transformations.
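Of these, constant folding means evaluating at compile time any expression whose operands are all known constants. A minimal C sketch of the evaluation step such a folder performs (the function name fold is illustrative, not taken from any particular compiler):

```c
#include <assert.h>

/* Evaluate a binary operation on two compile-time integer constants.
   A compiler would apply this while walking the intermediate code,
   replacing an instruction such as t1 = 2 * 21 by t1 = 42. */
static int fold(char op, int a, int b) {
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    default:  return 0;              /* operator not handled in this sketch */
    }
}
```

A real folder must also respect the target machine's arithmetic (overflow, division by zero), which this sketch ignores.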
Frequently, a program will include several calculations of the same value, such as an offset in an
array. Some of these duplicate calculations cannot be avoided by the programmer because they lie
below the level of detail accessible within the source language. For example, block B5 shown in
Fig. 9.5(a) recalculates 4*i and 4*j, although neither of these calculations was
requested explicitly by the programmer.
Common Subexpressions
An occurrence of an expression E is called a common subexpression if E was previously
computed and the values of the variables in E have not changed since the previous computation.
We can avoid re-computing the expression if we can use the previously computed value.
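In C terms, the idea looks as follows. This is a minimal sketch in which 4*i is used directly as an array index for simplicity (in three-address code it would be a byte offset), and the array contents are illustrative:

```c
#include <assert.h>

static const int a[16] = {0, 1, 2, 3, 4, 5, 6, 7,
                          8, 9, 10, 11, 12, 13, 14, 15};

/* Before elimination: 4*i is computed twice. */
static int sum_before(int i) {
    int t1 = 4 * i;                  /* first computation of 4*i */
    int x  = a[t1];
    int t2 = 4 * i;                  /* common subexpression: i is unchanged */
    int y  = a[t2];
    return x + y;
}

/* After elimination: the second occurrence reuses the temporary t1. */
static int sum_after(int i) {
    int t1 = 4 * i;                  /* computed once */
    int x  = a[t1];
    int y  = a[t1];                  /* previously computed value reused */
    return x + y;
}
```

The transformation is safe precisely because no variable of the expression (here i) is assigned between the two occurrences.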
Example-1: The assignments to t7 and t10 in Fig. 9.5(a) compute the common subexpressions
4*i and 4*j, respectively. These steps have been eliminated in Fig. 9.5(b), which uses t6 instead
of t7 and t8 instead of t10.
Example-2: The following figure shows the result of eliminating both global and local common
subexpressions from blocks B5 and B6 in the flow graph of Fig. 9.5. We first discuss the
transformation of B5 and then mention some subtleties involving arrays.
After local common subexpressions are eliminated, B5 still evaluates 4*i and 4 * j , as shown in
Fig. 9.5(b). Both are common subexpressions; in particular, the three statements
t8 = 4*j
t9 = a[t8]
a[t8] = x
in B5 can be replaced by
t9 = a[t4]
a[t4] = x
using t4 computed in block B3. In Fig. 9.6, observe that as control passes from the evaluation of
4*j in B3 to B5, there is no change to j and no change to t4, so t4 can be used if 4 * j is needed.
Another common subexpression comes to light in B5 after t4 replaces t8. The new expression a[t4]
corresponds to the value of a[j] at the source level. Not only does j retain its value as control
leaves B3 and then enters B5, but a[j], a value computed into a temporary t5, does too, because
there are no assignments to elements of the array a in the interim. The statements
t9 = a[t4]
a[t6] = t9
in B5 can therefore be replaced by
a[t6] = t5
Copy Propagation
Had we gone into more detail in Example-2 above, copies (assignments of the form u = v) would have arisen much sooner, because the
normal algorithm for eliminating common subexpressions introduces them, as do several other algorithms.
Example-3: In order to eliminate the common subexpression from the statement c = d+e in Fig.
9.7(a), we must use a new variable t to hold the value of d+e. The value of variable t, instead of
that of the expression d+e, is assigned to c in Fig. 9.7(b). Since control may reach c = d+e
either after the assignment to a or after the assignment to b, it would be incorrect to replace
c = d+e by either c = a or by c = b.
The idea behind the copy-propagation transformation is to use v for u, wherever possible after the
copy statement u = v. For example, the assignment x = t3 in block B5 of Fig. 9.6 is a copy. Copy
propagation applied to B5 yields the code in Fig. 9.8. This change may not appear to be an
improvement, but it gives us the opportunity to eliminate the assignment to x.
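In C terms, copy propagation replaces later uses of the copy's target by its source. A minimal sketch (the names t2 and t3 echo the temporaries above, but the functions themselves are illustrative):

```c
#include <assert.h>

/* Before copy propagation: x is a copy of t3. */
static int before(int t2, int t3) {
    int x = t3;                      /* copy statement x = t3 */
    int d = x - t2;                  /* use of x */
    return d * x;                    /* another use of x */
}

/* After copy propagation: each use of x is replaced by t3. The copy
   x = t3 is now dead and can be removed by dead-code elimination. */
static int after(int t2, int t3) {
    int d = t3 - t2;                 /* x replaced by t3 */
    return d * t3;
}
```

As the comments note, the payoff is usually not the propagation itself but the dead copy it exposes.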
Dead-Code Elimination
A variable is live at a point in a program if its value can be used subsequently; otherwise, it is
dead at that point. A related idea is dead (or useless) code - statements that compute values that
never get used. While the programmer is unlikely to introduce any dead code intentionally, it may
appear as the result of previous transformations.
Example-4: Suppose debug is set to TRUE or FALSE at various points in the program, and
used in statements like
if (debug) print ...
If copy propagation determines that the only value of debug that can reach this statement is FALSE, then the test always fails, so the print statement is dead code and can be eliminated.
One advantage of copy propagation is that it often turns the copy statement into dead code. For
example, copy propagation followed by dead-code elimination removes the assignment to x and
transforms the code in Fig. 9.8 into
a[t2] = t5
a[t4] = t3
goto B2
This code is a further improvement of block B5 in Fig. 9.6.
Code Motion
Loops are a very important place for optimizations, especially the inner loops where programs
tend to spend the bulk of their time. The running time of a program may be improved if we
decrease the number of instructions in an inner loop, even if we increase the amount of code
outside that loop.
An important modification that decreases the amount of code in a loop is code motion. This
transformation takes an expression that yields the same result independent of the number of times
a loop is executed (a loop-invariant computation) and evaluates the expression before the loop.
Note that the notion "before the loop" assumes the existence of an entry for the loop, that is, one
basic block to which all jumps from outside the loop go.
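A classic case is a loop condition such as i <= limit - 2, where limit is not changed inside the loop. A minimal C sketch (function and variable names are illustrative):

```c
#include <assert.h>

/* Before code motion: limit - 2 is re-evaluated on every test. */
static int count_before(int limit) {
    int i = 0, n = 0;
    while (i <= limit - 2) {         /* loop-invariant computation in the test */
        n++;
        i++;
    }
    return n;
}

/* After code motion: the invariant expression is evaluated once,
   before control enters the loop. */
static int count_after(int limit) {
    int t = limit - 2;               /* hoisted out of the loop */
    int i = 0, n = 0;
    while (i <= t) {
        n++;
        i++;
    }
    return n;
}
```

The transformation is legal only because limit cannot change while the loop runs; a compiler must verify this before hoisting.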
Peephole Optimization
A simple but effective technique for locally improving the target code is peephole optimization: examining a short sequence of target instructions (called the peephole) and replacing it by a shorter or faster sequence whenever possible. In this
section, we shall give the following examples of program transformations that are characteristic
of peephole optimizations:
Redundant-instruction elimination
Flow-of-control optimizations
Use of machine idioms
Redundant-Instruction Elimination
For example, suppose the debugging code of Example-4 is compiled into the sequence
if debug != 1 goto L2
print debugging information
L2:
and it is known that debug is always 0 at this point. Now the argument of the first statement always evaluates to true, so the statement can be replaced
by goto L2. Then all statements that print debugging information are unreachable and can be
eliminated one at a time.
Flow-of-Control Optimizations
Simple intermediate code-generation algorithms frequently produce jumps to jumps, jumps to
conditional jumps, or conditional jumps to jumps. These unnecessary jumps can be eliminated in
either the intermediate code or the target code by the following types of peephole optimizations.
We can replace the sequence
goto L1
...
L1: goto L2
by the sequence
goto L2
...
L1: goto L2
If there are now no jumps to L1, then it may be possible to eliminate the statement L1: goto L2,
provided it is preceded by an unconditional jump.
Similarly, the sequence
if (a < b) goto L1
...
L1: goto L2
can be replaced by the sequence
if (a < b) goto L2
...
L1: goto L2
Finally, suppose there is only one jump to L1 and L1 is preceded by an unconditional goto. Then
the sequence
goto L1
...
L1: if (a < b) goto L2
L3:
may be replaced by the sequence
if (a < b) goto L2
goto L3
...
L3:
While the number of instructions in the two sequences is the same, we sometimes skip the
unconditional jump in the second sequence, but never in the first. Thus, the second sequence is
superior to the first in execution time.
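The same effect can be mimicked at the source level with C gotos. A minimal sketch of the jump-to-jump case (labels and function names are illustrative; both functions return 1 exactly when a < b):

```c
#include <assert.h>

/* Before: the conditional jump targets L1, which only jumps on to L2. */
static int branch_before(int a, int b) {
    if (a < b) goto L1;
    return 0;
L1: goto L2;                         /* a jump to a jump */
L2: return 1;
}

/* After: the conditional jump targets L2 directly. If nothing else
   jumps to L1, the statement L1: goto L2 can then be removed. */
static int branch_after(int a, int b) {
    if (a < b) goto L2;
    return 0;
L2: return 1;
}
```

Both functions compute the same result; the second simply takes one fewer jump on the taken path.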