4nd5 unit cd CodeGeneration
Register Allocation
Machine Dependent Optimization
Dr. N. Kalyani
Code Generation
• The final phase in our compiler model is the code generator.
• Input: Intermediate representation (IR) and symbol table information
• Three address code
• Postfix notation
• Abstract Syntax tree
• Output: Target program
• Absolute code
• Relocatable code
• Assembly Code
• Target program must preserve the semantic meaning and be of high quality.
• A code generator has three primary tasks: instruction selection, register allocation
and assignment, and instruction ordering.
Issues in the Design of a Code Generator
1 Input to the Code Generator
2 The Target Program
3 Instruction Selection
4 Register Allocation
5 Evaluation Order
1. Input to the Code Generator
• The input to the code generator is the intermediate representation of the
source program produced by the front end, along with information in the
symbol table that is used to determine the run-time addresses of the data
objects denoted by the names in the IR.
• The many choices for the IR include
• Three-address representations such as quadruples, triples, indirect
triples.
• Virtual machine representations such as bytecodes and stack-machine
code.
• Linear representations such as postfix notation.
• Graphical representations such as syntax trees and DAG's.
2. The Target Program
• The instruction-set architecture of the target machine has a significant impact on the difficulty of constructing a good code generator.
• RISC (reduced instruction set computer) : A RISC machine typically has many registers, three-
address instructions, simple addressing modes, and a relatively simple instruction-set architecture.
• CISC (complex instruction set computer) : A CISC machine typically has few registers, two-address
instructions, a variety of addressing modes, several register classes, variable-length instructions,
and instructions with side effects.
• Stack based : operations are done by pushing operands onto a stack and then performing the
operations on the operands at the top of the stack.
• Target Code forms
• Absolute machine-code : it can be placed in a fixed location in memory and immediately executed.
• Relocatable machine code : allows subprograms to be compiled separately. A set of relocatable
object modules can be linked together and loaded for execution by a linking loader. More
flexibility.
• Assembly code : makes the process of code generation somewhat easier. We can generate symbolic
instructions and use the macro facilities of the assembler to help generate code.
3. Instruction Selection
• The code generator must map the IR program into a code sequence that can be
executed by the target machine. The complexity of this mapping depends on factors such as:
• the level of the IR
• the nature of the instruction-set architecture
• the desired quality of the generated code.
• If the IR is high level, the code generator may translate each IR statement into a
sequence of machine instructions using code templates. Such statement-by-
statement code generation, however, often produces poor code that needs further
optimization.
• If the IR reflects some of the low-level details of the underlying machine, then the
code generator can use this information to generate more efficient code sequences.
• The nature of the instruction set of the target machine has a strong effect on the difficulty
of instruction selection.
• If the target machine does not support each data type in a uniform manner, then each
exception to the general rule requires special handling.
• For example, every three-address statement of the form x = y + z, where x, y, and z are
statically allocated, can be translated into the code sequence
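The code sequence itself is not reproduced in these notes. As a minimal sketch, assuming the two-address MOV src, dst notation used later in the notes and a hypothetical scratch register R0, a code template for x = y + z could be expanded like this. The names ADD_TEMPLATE, select_add and translate are illustrative only.

```python
# Hypothetical code template for "x = y + z" on a two-address machine.
# Assumption: MOV src, dst notation and R0 as the only scratch register.
ADD_TEMPLATE = ["MOV {y}, R0", "ADD {z}, R0", "MOV R0, {x}"]


def select_add(x, y, z):
    """Expand one statement x = y + z into target instructions."""
    return [line.format(x=x, y=y, z=z) for line in ADD_TEMPLATE]


def translate(statements):
    """Translate a list of (x, y, z) statements one statement at a time."""
    code = []
    for x, y, z in statements:
        code.extend(select_add(x, y, z))
    return code


# a = b + c followed by d = a + e
for instr in translate([("a", "b", "c"), ("d", "a", "e")]):
    print(instr)
```

In the printed result, the store MOV R0, a is immediately followed by the load MOV a, R0, which is redundant. This is the kind of poor code that statement-by-statement generation tends to produce.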
4. Register Allocation
• Registers are the fastest computational unit on the target machine, but we usually
do not have enough of them to hold all values.
• Instructions involving register operands are invariably shorter and faster than
those involving operands in memory, so efficient utilization of registers is
particularly important.
• The use of registers is often subdivided into two subproblems:
• Register allocation, during which we select the set of variables that will reside in registers at
each point in the program.
• Register assignment, during which we pick the specific register that a variable will reside in.
• Finding an optimal assignment of registers to variables is difficult, even with
single-register machines. Mathematically, the problem is NP-complete.
5. Evaluation Order
• The order in which computations are performed can affect the
efficiency of the target code.
• Some computation orders require fewer registers to hold intermediate
results than others.
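To illustrate the effect of evaluation order, the sketch below counts the registers needed for an expression tree under two orders. It uses an Ershov-style labeling, which is not named in these notes, and assumes a register-register machine in which every leaf must be loaded into a register. The function name need and the tuple encoding of trees are assumptions made for the example.

```python
def need(node, heavier_first=True):
    """Registers needed to evaluate an expression tree without spilling.

    A node is either a leaf name (str) or a tuple (op, left, right).
    Assumption: every leaf is loaded into a register before use.
    """
    if isinstance(node, str):
        return 1
    _, left, right = node
    l = need(left, heavier_first)
    r = need(right, heavier_first)
    if heavier_first:
        first, second = max(l, r), min(l, r)
    else:
        first, second = min(l, r), max(l, r)
    # While the second operand is evaluated, one register holds the first result.
    return max(first, second + 1)


# a + (b * (c - d))
expr = ("+", "a", ("*", "b", ("-", "c", "d")))
print(need(expr, heavier_first=True))
print(need(expr, heavier_first=False))
```

Evaluating the more demanding subtree first keeps the count lower, because only one register is tied up while the cheaper operand is computed.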
A SIMPLE CODE GENERATOR
• Register and Address Descriptors:
A register descriptor is used to keep track of what is currently in each register. The
register descriptors show that initially all the registers are empty.
An address descriptor stores the location where the current value of the name can
be found at run time.
• Input: Three-address statements of the form x := y op z constituting a basic block. For each statement, perform the following actions:
1. Invoke a function getreg to determine the location L where the result of the computation y op z should be stored.
2. Consult the address descriptor for y to determine y’, the current location of y. Prefer the register for y’ if the value of y is currently both in memory and a register. If the value of y is not already in L, generate the instruction MOV y’, L to place a copy of y in L.
3. Generate the instruction OP z’, L where z’ is a current location of z. Prefer a register to a memory location if z is in both. Update the address descriptor of x to indicate that x is in location L. If L is a register, update its descriptor to indicate that it contains the value of x, and remove x from all other register descriptors.
4. If the current values of y or z have no next uses, are not live on exit from the block, and are in registers, alter the register descriptor to indicate that, after execution of x := y op z, those registers will no longer contain y or z.
The assignment d := (a-b) + (a-c) + (a-c) might be translated into the following three-address code sequence:
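One such sequence, with temporaries t, u and v chosen here for illustration, is t := a - b, u := a - c, v := t + u, d := v + u. The sketch below runs the four steps of the simple code generator on that block. It is a simplified illustration under several assumptions: two registers R0 and R1, only d live on exit, variables not yet defined in the block reside in memory under their own names, and a getreg that reuses a dead operand's register, then an empty register, then spills the first register.

```python
def generate(block, live_on_exit, registers=("R0", "R1")):
    """Generate two-address code for a block of (x, op, y, z) statements."""
    reg_desc = {r: set() for r in registers}  # register -> names it holds
    addr_desc = {}                            # name -> current locations
    code = []

    def locations(name):
        # Assumption: a name not yet defined in the block is in memory.
        return addr_desc.setdefault(name, {name})

    def best_location(name):
        # Prefer a register over memory when the value is in both.
        regs = sorted(loc for loc in locations(name) if loc in reg_desc)
        return regs[0] if regs else name

    def needed_after(name, i):
        # Next use after statement i, or live on exit from the block.
        for x, _, y, z in block[i + 1:]:
            if name in (y, z):
                return True
            if name == x:
                return False
        return name in live_on_exit

    def getreg(i, y):
        for r in registers:  # reuse y's register if y is dead afterwards
            if reg_desc[r] == {y} and not needed_after(y, i):
                return r
        for r in registers:  # otherwise take an empty register
            if not reg_desc[r]:
                return r
        r = registers[0]     # otherwise spill: save values not yet in memory
        for name in reg_desc[r]:
            if name not in locations(name):
                code.append(f"MOV {r}, {name}")
                locations(name).add(name)
            locations(name).discard(r)
        reg_desc[r] = set()
        return r

    for i, (x, op, y, z) in enumerate(block):
        L = getreg(i, y)                              # step 1
        y_loc = best_location(y)                      # step 2
        if y_loc != L:
            code.append(f"MOV {y_loc}, {L}")
        code.append(f"{op} {best_location(z)}, {L}")  # step 3
        for name in reg_desc[L]:
            locations(name).discard(L)
        reg_desc[L] = {x}
        for r in registers:
            if r != L:
                reg_desc[r].discard(x)
        addr_desc[x] = {L}
        for name in (y, z):                           # step 4
            if name != x and not needed_after(name, i):
                for r in registers:
                    if name in reg_desc[r]:
                        reg_desc[r].discard(name)
                        locations(name).discard(r)

    # At the end of the block, store live names that exist only in registers.
    for name in sorted(live_on_exit):
        if name not in locations(name):
            code.append(f"MOV {best_location(name)}, {name}")
    return code


BLOCK = [("t", "SUB", "a", "b"), ("u", "SUB", "a", "c"),
         ("v", "ADD", "t", "u"), ("d", "ADD", "v", "u")]
for instr in generate(BLOCK, live_on_exit={"d"}):
    print(instr)
```

Under these assumptions the block needs seven instructions: t and u are each computed with a MOV and a SUB, v and d reuse R0 because t and v are dead after their single use, and d is stored once at the end.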
REGISTER ALLOCATION AND ASSIGNMENT
Local register allocation
Register allocation is performed only within a basic block. It follows a top-down approach.
Assign registers to the most heavily used variables
Traverse the block
Count uses
Use count as a priority function
Assign registers to higher priority variables first
Advantage
Heavily used values reside in registers
Disadvantage
Does not consider non-uniform distribution of uses
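The top-down scheme above can be sketched as follows. This is an illustrative version that counts every occurrence of a name in the block (definitions as well as uses, which is an assumption) and breaks ties alphabetically. The function name local_allocate and the (x, op, y, z) statement encoding are hypothetical.

```python
from collections import Counter


def local_allocate(block, registers):
    """Give registers to the most frequently occurring names in one block."""
    counts = Counter()
    for x, _, y, z in block:
        counts.update([x, y, z])  # assumption: definitions count like uses
    # Use the count as the priority; ties are broken by name for determinism.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    return dict(zip(ranked, registers))


block = [("t", "-", "a", "b"), ("u", "-", "a", "c"),
         ("v", "+", "t", "u"), ("d", "+", "v", "u")]
print(local_allocate(block, ["R0", "R1"]))
```

Because the counts are flat, a name that occurs once inside a hot loop and a name that occurs once in straight-line code get the same priority, which is the disadvantage noted above.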
Need of global register allocation
Local allocation does not take into account that some instructions (e.g. those in
loops) execute more frequently. It forces us to store/load at basic block endpoints
since each block has no knowledge of the context of others.
Global allocation is needed to determine the live range(s) of each variable and the
region(s) where the variable is used or defined. The cost of spilling depends on the
frequencies and locations of uses.
Register allocation depends on:
Size of live range
Number of uses/definitions
Frequency of execution
Number of loads/stores needed.
Cost of loads/stores needed.
Register allocation by graph coloring
Global register allocation can be seen as a graph coloring problem.
Basic idea:
1. Identify the live range of each variable.
2. Build an interference graph that represents conflicts between live ranges (two nodes are connected if the variables they represent are live at the same moment).
3. Try to color the nodes of the graph using no more colors than there are registers, so that no two neighboring nodes have the same color. Each color corresponds to one register.
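The coloring step can be sketched with a simplify-and-select heuristic in the style of Chaitin's allocator. This is one common approach, not necessarily the one intended by these notes. The interference graph is assumed to be given as a symmetric dictionary of neighbor sets, and picking the highest-degree node as the spill candidate is an assumption made for brevity.

```python
def color(graph, k):
    """Try to color an interference graph {node: set(neighbors)} with k colors.

    Returns (assignment, spilled): a color index per colored node and the
    list of nodes chosen for spilling.
    """
    work = {n: set(adj) for n, adj in graph.items()}
    stack, spilled = [], []
    while work:
        # Simplify: a node with fewer than k neighbors can always be colored.
        node = next((n for n in sorted(work) if len(work[n]) < k), None)
        if node is None:
            # No such node: pick a spill candidate (highest degree here).
            node = max(sorted(work), key=lambda n: len(work[n]))
            spilled.append(node)
        else:
            stack.append(node)
        for m in work.pop(node):
            work[m].discard(node)
    # Select: color nodes in reverse removal order.
    assignment = {}
    for node in reversed(stack):
        used = {assignment[m] for m in graph[node] if m in assignment}
        assignment[node] = min(c for c in range(k) if c not in used)
    return assignment, spilled


# a, b and c are live at the same time; d overlaps only with c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(color(graph, 3))
print(color(graph, 2))
```

With three registers every live range gets a color. With two, the triangle a, b, c cannot be colored, so one of its nodes is spilled.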
PEEPHOLE OPTIMIZATION
• Redundant-instructions elimination
• Flow-of-control optimizations
• Algebraic simplifications
• Unreachable code
• Redundant Loads And Stores:
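If we see the instruction sequence
(1) MOV R0,a
(2) MOV a,R0
we can delete instruction (2), because whenever (2) is executed, (1) has already ensured that the value of a is in register R0. This is safe only if (2) has no label, since a jump to (2) would bypass (1).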
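A peephole pass for this pattern could look like the following sketch. It assumes instructions are plain strings in MOV src,dst form and that a labeled instruction starts with its label, so labeled loads are never removed. The function name is hypothetical.

```python
def drop_redundant_loads(code):
    """Remove 'MOV a,R' when it directly follows 'MOV R,a' and has no label."""
    out = []
    for instr in code:
        if out and instr.startswith("MOV ") and out[-1].startswith("MOV "):
            src, dst = [s.strip() for s in instr[4:].split(",")]
            prev_src, prev_dst = [s.strip() for s in out[-1][4:].split(",")]
            if (src, dst) == (prev_dst, prev_src):
                continue  # the value is already in the register
        out.append(instr)
    return out


print(drop_redundant_loads(["MOV R0,a", "MOV a,R0", "ADD b,R0"]))
print(drop_redundant_loads(["MOV R0,a", "L1: MOV a,R0"]))
```

The second call leaves the code unchanged because the load carries a label and may be reached by a jump.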
• Flow-of-Control Optimizations:
Unnecessary jumps can be eliminated in either the intermediate code or the target code by the
following types of peephole optimizations.
Jump to a jump. We can replace the sequence
goto L1
….
L1: goto L2
by the sequence
goto L2
….
L1: goto L2
Conditional jump to a jump. Similarly, we can replace the sequence
if a < b goto L1
….
L1: goto L2
by the sequence
if a < b goto L2
….
L1: goto L2
Jump to a conditional jump. If there is only one jump to L1 and L1 is preceded by an unconditional goto, we can replace the sequence
goto L1
….
L1: if a < b goto L2
L3:
by the sequence
if a < b goto L2
goto L3
….
L3:
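The first two rewrites (a jump or conditional jump whose target is itself an unconditional goto) can be sketched as below. The representation of code as (label, instruction) pairs and the function name thread_jumps are assumptions for this example. The third rewrite is not covered because it also needs to know how many jumps reach L1.

```python
def thread_jumps(code):
    """Retarget jumps whose destination is an unconditional 'goto'.

    code is a list of (label_or_None, instruction) pairs.
    """
    # Map each label whose instruction is 'goto X' to X.
    forwards = {}
    for label, instr in code:
        if label and instr.startswith("goto "):
            forwards[label] = instr.split()[1]

    def final_target(label):
        seen = set()  # guards against cycles such as 'L1: goto L1'
        while label in forwards and label not in seen:
            seen.add(label)
            label = forwards[label]
        return label

    out = []
    for label, instr in code:
        words = instr.split()
        if len(words) >= 2 and words[-2] == "goto":
            words[-1] = final_target(words[-1])
            instr = " ".join(words)
        out.append((label, instr))
    return out


code = [(None, "goto L1"), (None, "if a < b goto L1"),
        ("L1", "goto L2"), ("L2", "x := 1")]
for label, instr in thread_jumps(code):
    print(f"{label}: {instr}" if label else instr)
```

After this pass L1: goto L2 may no longer be the target of any jump, in which case a later pass could remove it.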
• Algebraic Simplification:
Only a few algebraic identities occur frequently enough that it is worth considering implementing them.
For example, statements such as
x := x + 0 or
x := x * 1
leave x unchanged and can be eliminated.
• Reduction in Strength:
Reduction in strength replaces expensive operations by equivalent cheaper ones on the target machine.
x² → x * x
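Both rewrites can be sketched on single statements. The tuple encoding (x, y, op, z) for x := y op z, the use of "**" for exponentiation, and the function name simplify are assumptions made for this example, and integer operands are assumed.

```python
def simplify(stmt):
    """Simplify one statement (x, y, op, z), meaning x := y op z.

    Returns None if the statement can be deleted, a copy (x, y, None, None),
    a cheaper statement, or the original statement unchanged.
    """
    x, y, op, z = stmt
    if (op, z) in {("+", 0), ("*", 1)}:
        # y + 0 and y * 1 are just y; the statement vanishes when x is y.
        return None if x == y else (x, y, None, None)
    if (op, z) == ("**", 2):
        # Reduction in strength: squaring becomes a multiplication.
        return (x, y, "*", y)
    return stmt


print(simplify(("x", "x", "+", 0)))
print(simplify(("x", "y", "*", 1)))
print(simplify(("x", "y", "**", 2)))
```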
• Use of Machine Idioms:
The target machine may have hardware instructions to implement certain specific operations efficiently.
i := i + 1 → i++
i := i - 1 → i--
• Unreachable Code:
Another opportunity for peephole optimizations is the removal of unreachable instructions. An
unlabeled instruction immediately following an unconditional jump may be removed.
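For example, for debugging purposes a program may contain the source fragment
#define debug 0
….
if ( debug ) {
print debugging information
}
In the intermediate representation the if-statement may be translated as
if debug = 1 goto L1
goto L2
L1: print debugging information
L2: …………………
Eliminating the jump over a jump, this can be replaced by the sequence
if debug ≠ 1 goto L2
print debugging information
L2: …………………
Since debug is defined as 0, the test becomes if 0 ≠ 1 goto L2, which is always true, so it can be replaced by goto L2. The print statement is then unreachable and can be removed.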
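The removal rule stated above (an unlabeled instruction immediately following an unconditional jump may be removed) can be sketched as a single pass. The (label, instruction) encoding and the function name are assumptions for this example. Applying the rule repeatedly removes a whole run of unlabeled instructions, which the loop does in one sweep.

```python
def remove_unreachable(code):
    """Drop unlabeled instructions that follow an unconditional jump.

    code is a list of (label_or_None, instruction) pairs.
    """
    out, skipping = [], False
    for label, instr in code:
        if label:
            skipping = False  # a label may be a jump target, so resume
        if skipping:
            continue
        out.append((label, instr))
        if instr.startswith("goto "):
            skipping = True
    return out


code = [(None, "goto L2"),
        (None, "print debugging information"),
        ("L2", "x := 1")]
for label, instr in remove_unreachable(code):
    print(f"{label}: {instr}" if label else instr)
```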