Introduction to Code Generation
Rahul Singh
Noida International University
April 25, 2025
Rahul Singh (Noida International University) Introduction to Code Generation April 25, 2025 1 / 37
Introduction
The final phase of a compiler is the code generator.
It takes as input an intermediate representation (IR) together with
supplementary information from the symbol table.
It produces a semantically equivalent target program.
Code generator main tasks:
Instruction selection
Register allocation and assignment
Instruction ordering
Compiler pipeline: Front end → Code optimizer → Code generator
Main Criterion
Correctness
The primary requirement for a code generator is that it should generate
correct code. This means that the generated machine or intermediate
code must faithfully reflect the original program’s logic and semantics.
Input to the Code Generator
Inputs to the code generator:
Intermediate Representation (IR): Low-level, machine-independent
code.
Symbol Table: Contains details about variables, constants, types, and
scope.
Assumptions:
The front-end has already transformed source code to IR.
All syntactic and semantic errors have been handled.
Variables are directly usable in generated machine code.
The Target Program
The code generator must produce machine code tailored for the
target architecture.
Common types of target architectures:
RISC (Reduced Instruction Set Computer): Simpler, faster
instructions.
CISC (Complex Instruction Set Computer): More complex
instructions.
Stack-based machines: Operate using operand stacks rather than
registers.
The target assumed here is a RISC-like machine with some
CISC-style addressing.
Summary
The code generator must ensure correctness.
It operates on Intermediate Representation (IR) and assumes
error-free input.
It must produce code compatible with the target architecture.
Issues in the Design of Code Generator
The most important criterion: It must produce correct code.
Input to the code generator:
Intermediate Representation (IR) + Symbol Table
Front-end produces low-level IR: variables can be manipulated directly.
Syntactic and semantic errors are already detected.
Target program:
Common architectures: RISC, CISC, and Stack-based machines
We use a RISC-like machine with some CISC-like addressing modes
Complexity of Mapping
The complexity of mapping the IR program to target code depends on:
The level of the Intermediate Representation (IR)
The nature of the instruction-set architecture (ISA)
The desired quality of the generated code
Example: x = y + z
LD R0, y
ADD R0, R0, z
ST x, R0
Example: a = b + c; d = a + e
LD R0, b
ADD R0, R0, c
ST a, R0
LD R0, a
ADD R0, R0, e
ST d, R0
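Both translations above follow one fixed pattern per statement: load the first operand, apply the operation with the second operand, store the result. A minimal Python sketch of that statement-by-statement scheme, assuming every statement has the shape `dst = src1 op src2` and that a single register R0 is used; the names `select_instructions`, `translate`, and `OPCODES` are illustrative and not part of the slides:

```python
# Naive statement-by-statement instruction selection (illustrative sketch).
# Assumes every statement has the exact shape "dst = src1 op src2".
OPCODES = {"+": "ADD", "-": "SUB"}  # assumed operator-to-opcode table


def select_instructions(statement, reg="R0"):
    dst, expr = (part.strip() for part in statement.split("="))
    src1, op, src2 = expr.split()
    return [
        f"LD {reg}, {src1}",                    # load the first operand
        f"{OPCODES[op]} {reg}, {reg}, {src2}",  # combine with the second operand
        f"ST {dst}, {reg}",                     # store the result
    ]


def translate(program):
    code = []
    for statement in program:
        code.extend(select_instructions(statement))
    return code
```

Because each statement is translated in isolation, `translate(["a = b + c", "d = a + e"])` reproduces the six-instruction sequence, including the store of a followed at once by a load of a.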
Optimization Perspective
In the previous example, variable a is stored and then reloaded.
This can be optimized by retaining a in a register instead of memory.
Optimized Code:
LD R0, b
ADD R0, R0, c
MOV R1, R0
ADD R0, R1, e
ST a, R1
ST d, R0
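One way to detect the redundant reload mechanically is a small peephole pass. The Python sketch below is a simplified stand-in rather than the method used on the slide: it only drops a `LD r, x` that directly follows `ST x, r`, and it assumes no branch targets the load. The function names are hypothetical.

```python
# Peephole pass (illustrative): drop "LD r, x" when it directly follows
# "ST x, r", because register r already holds the value of x.
def parse(instr):
    opcode, _, operands = instr.partition(" ")
    return opcode, [o.strip() for o in operands.split(",")]


def remove_redundant_loads(code):
    result = []
    for instr in code:
        if result:
            prev_op, prev_args = parse(result[-1])
            op, args = parse(instr)
            # "ST x, r" followed by "LD r, x": the load is redundant.
            if prev_op == "ST" and op == "LD" and prev_args == args[::-1]:
                continue
        result.append(instr)
    return result
```

Applied to the unoptimized six-instruction sequence, this pass keeps a in R0 and yields five instructions without needing the extra MOV; the slide's version reaches the same result with a second register.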
Register Allocation
Two Subproblems:
Register Allocation: Selecting the set of variables that will reside in
registers at each point in the program.
Register Assignment: Choosing the specific register (e.g., R0, R1) for
each variable.
Complications due to Hardware Architecture:
Example: Some architectures require register pairs for
multiplication/division.
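The split between the two subproblems can be sketched in Python. The usage-count heuristic below is a stand-in chosen for brevity, not a technique from the slides, and it picks one set of variables for the whole program rather than per program point. The names `allocate` and `assign` are hypothetical.

```python
from collections import Counter


def allocate(uses, k):
    """Allocation: choose WHICH variables live in registers (k most used)."""
    counts = Counter(uses)
    # Sort by descending use count, then by name for a deterministic result.
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    return ranked[:k]


def assign(allocated):
    """Assignment: choose a SPECIFIC register for each allocated variable."""
    return {var: f"R{i}" for i, var in enumerate(allocated)}
```

For example, with uses `["a", "b", "a", "c", "a", "b"]` and two registers, allocation selects a and b, and assignment then maps them to R0 and R1. Hardware constraints such as register pairs would restrict the second step.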
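Example Code Snippet:
General Register Use        Architecture-Conscious Use
t = a + b                   t = a + b
t = t * c                   t = t + c
t = t / d                   t = t / d

L    R1, a                  L    R0, a
A    R1, b                  A    R0, b
M    R0, c                  A    R0, c
D    R0, d                  SRDA R0, 32
ST   R1, t                  D    R0, d
                            ST   R1, t
M and D operate on the even/odd register pair R0/R1. SRDA R0, 32 shifts the
dividend from R0 into R1 and fills R0 with its sign bit before the divide.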
A Simple Target Machine Model
Load operations:
LD r, x – Load the value at memory location x into register r.
LD r1, r2 – Copy the contents of register r2 into register r1 (register-to-register copy).
Store operations:
ST x, r – Store the value in register r into memory location x.
Computation operations:
OP dst, src1, src2 – Perform operation OP using src1 and src2,
and store result in dst.
Unconditional jumps:
BR L – Branch to label L.
Conditional jumps:
Bcond r, L – Branch to label L if condition on register r is met.
For example: BLTZ r, L (Branch if Less Than Zero)
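To make the instruction semantics concrete, here is a tiny Python interpreter for a subset of this machine model. It is a sketch under stated assumptions: only the `LD r, x` form of load is covered, memory is a dict keyed by variable name, labels map to instruction indices, and only ADD and SUB stand in for OP. The function name `run` is hypothetical.

```python
# Tiny interpreter for a subset of the machine model (illustrative sketch).
# Assumptions: memory is a dict keyed by variable name, labels map to
# instruction indices, and only ADD/SUB are implemented for OP.
def run(program, memory, labels=None):
    labels = labels or {}
    regs = {}
    pc = 0
    while pc < len(program):
        op, *args = program[pc].replace(",", " ").split()
        pc += 1
        if op == "LD":                      # LD r, x
            regs[args[0]] = memory[args[1]]
        elif op == "ST":                    # ST x, r
            memory[args[0]] = regs[args[1]]
        elif op in ("ADD", "SUB"):          # OP dst, src1, src2
            a, b = (regs[s] if s in regs else memory[s] for s in args[1:])
            regs[args[0]] = a + b if op == "ADD" else a - b
        elif op == "BR":                    # BR L
            pc = labels[args[0]]
        elif op == "BLTZ":                  # BLTZ r, L
            if regs[args[0]] < 0:
                pc = labels[args[1]]
        else:
            raise ValueError(f"unknown instruction: {program[pc - 1]}")
    return memory
```

Running `run(["LD R0, y", "ADD R0, R0, z", "ST x, R0"], {"y": 2, "z": 3})` leaves the sum of y and z in memory location x.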
Target Program for a Sample Call and Return
High-Level Program (procedures c and p):
Procedure c:
action
call p
action
halt
Procedure p:
action
return
Target Code:
100: ACTION          // code for c
120: ST 364, #140    // save return address 140 in location 364
132: BR 200          // call p
140: ACTION          // after return
160: HALT            // halt program
200: ACTION          // code for p
220: BR *364         // return to caller
Memory Allocation Overview:
364 – Holds return address for procedure call
300–363 – Activation record for c (local vars, return info)
364–451 – Activation record for p
Target Program for a Sample Call and Return
This slide demonstrates how high-level procedure calls (e.g., C, Python)
are translated into low-level target machine code.
High-Level Program: Procedures 'c' and 'p'
Procedure c:
action – Do something
call p – Call procedure p
action – Resume and do something after return
halt – Stop execution
Procedure p:
action – Do something in p
return – Return to caller
Target Code (Machine Instructions)
Address   Code            Comment
100       ACTION          Procedure 'c' – first action
120       ST 364, #140    Store return address (140) into memory[364]
132       BR 200          Branch to procedure 'p' (starts at 200)
140       ACTION          Execute action after returning from 'p'
160       HALT            End program execution
200       ACTION          Procedure 'p' – do something
220       BR *364         Return to address stored in memory[364]
Key Operation Explained
ST 364, #140 – Store return address (140) in memory location 364.
BR 200 – Branch to procedure p at address 200.
BR *364 – Return to address stored in memory[364], i.e., 140.
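The three operations above can be traced with a short Python simulation. This is a sketch, not an emulator of any real machine: ACTION is treated as a placeholder that falls through to the next instruction, `BR_INDIRECT` is a made-up encoding of `BR *364`, and the `NEXT` table of fall-through addresses is copied from the listing.

```python
# Trace of the static call/return sequence (illustrative sketch).
# Each entry maps an address to (opcode, operand); ACTION is a placeholder.
CODE = {
    100: ("ACTION", None),
    120: ("ST", (364, 140)),      # memory[364] = 140
    132: ("BR", 200),             # call p
    140: ("ACTION", None),
    160: ("HALT", None),
    200: ("ACTION", None),
    220: ("BR_INDIRECT", 364),    # BR *364
}
NEXT = {100: 120, 120: 132, 140: 160, 200: 220}  # fall-through successors


def trace(start=100):
    memory, pc, visited = {}, start, []
    while True:
        visited.append(pc)
        op, arg = CODE[pc]
        if op == "HALT":
            return visited, memory
        if op == "ST":
            memory[arg[0]] = arg[1]
            pc = NEXT[pc]
        elif op == "BR":
            pc = arg
        elif op == "BR_INDIRECT":
            pc = memory[arg]      # jump to the saved return address
        else:                     # ACTION: fall through
            pc = NEXT[pc]
```

The visited addresses come out in the order 100, 120, 132, 200, 220, 140, 160, which shows control passing into p and back to the instruction after the call.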
Memory Allocation Overview
364: Holds the return address.
300–363: Activation record for procedure c.
364–451: Activation record for procedure p.
These regions help isolate each procedure’s variables and control data to
prevent interference.
Summary
Demonstrates a simple calling convention:
ST to save return address
BR to jump to procedures
BR *addr to return to caller
Basis for advanced concepts:
Stack frames
Call stack
Recursion
Visualization: Procedure Call Stack
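Procedure c (activation record 300–363)
Saves return address 140 in location 364
Local Vars...
↓ Call p (BR 200)
Procedure p (activation record 364–451)
Return address held in location 364: 140
Local Vars...
↓ Return (BR *364)
Resume c at 140
Action & HALT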
How It Works: Call and Return Flow
The caller c performs action, saves the return address (140), and
jumps to p.
Procedure p executes action, then returns using BR *364.
Control resumes at address 140 in c (action).
The program halts after completing.
Key Concepts
Indirect jump using memory for return addresses.
Activation records maintain local data and return points.
Enables modular design. Recursion additionally needs stack allocation,
because here each procedure has a single fixed activation record.
Stack Allocation
Assembly Instructions:
LD SP, #stackStart          ; initialize the stack
... code for first proc ...
HALT                        ; terminate execution
ADD SP, SP, #recordSize     ; reserve stack frame
ST *SP, #here + 16          ; save return address
BR callee.codeArea          ; branch to callee
; Return to caller
in callee: BR *0(SP)
in caller: SUB SP, SP, #recordSize
Explanations:
LD SP, #stackStart: Initialize stack pointer at base address.
ADD SP, SP, #size: Allocate activation record space.
ST *SP, #here+16: Store return address on stack.
BR callee: Jump to callee’s code area.
BR *0(SP): Return by indirect branch using top of stack.
SUB SP, SP, #size: Free the stack frame after the return.
Stack Allocation: Overview
The stack is used to manage:
Procedure calls and returns
Local variables
Return addresses
Temporary data
We simulate this in assembly-level instructions.
Assembly Instructions and Explanations
Assembly Instructions                   Explanations
LD SP, #stackStart                      Initialize Stack Pointer
... code for first proc ...
HALT
ADD SP, SP, #recordSize                 Allocate activation record
ST *SP, #here + 16                      Save return address
BR callee.codeArea                      Branch to callee
; Return to caller
in callee: BR *0(SP)                    Return via indirect branch
in caller: SUB SP, SP, #recordSize      Free stack frame
Summary of Stack-Based Procedure Call
Before the Call:
Allocate stack space
Store return address
Call the Procedure:
Jump to callee’s code
Inside Callee:
Execute instructions
Return using address from top of stack
After Return:
Deallocate stack frame
Key Concepts in Stack-Based Call Handling
The stack is used to manage nested procedure calls via activation
records.
Each call stores the return address and local data in its activation
record.
The callee returns to the caller using an indirect jump to the saved
address.
After return, the stack is popped to discard the callee’s frame.
This mechanism enables recursion and modular program structure.
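The calling sequence can be modelled in a few lines of Python. This is a sketch under stated assumptions: the ADD, ST, and BR instructions are 8 bytes each, so the return address is the ST address plus 16 (the `#here + 16` rule), and the record sizes 40 and 24 in the usage note are made-up values. The class `Machine` and its method names are hypothetical.

```python
# Stack-allocation calling sequence (illustrative sketch).
# Assumption: the return address is "here + 16", measured from the
# address of the ST instruction that saves it.
class Machine:
    def __init__(self, stack_start):
        self.sp = stack_start          # LD SP, #stackStart
        self.memory = {}

    def call(self, st_address, record_size):
        self.sp += record_size                    # ADD SP, SP, #recordSize
        self.memory[self.sp] = st_address + 16    # ST *SP, #here + 16
        # BR callee.codeArea would transfer control at this point.

    def ret(self):
        return self.memory[self.sp]               # BR *0(SP)

    def after_return(self, record_size):
        self.sp -= record_size                    # SUB SP, SP, #recordSize
```

With a stack starting at 600, a call whose ST sits at address 136 saves return address 152; a nested call whose ST sits at 328 saves 344 one record higher. Returning from the inner call and restoring SP exposes 152 again, which is the LIFO behaviour the bullets describe.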
High-Level Procedure Flow
Procedure m:
action
call q
action
halt
Procedure p:
action
return
Procedure q:
action
call p
action
call q
action
call q
return
High-Level Procedure Flow
Goal: Understand how procedures interact via calls and returns in a
compiler environment using a stack.
This flow illustrates nested and recursive function calls across three
procedures:
m → Main
p → Simple procedure
q → Recursive procedure
Pseudocode Representation
Procedure m (Main):
action
call q
action
halt
Procedure p:
action
return
Procedure q (Recursive):
action
call p
action
call q
action
call q
return
Key Concepts Illustrated
Call Stack Behavior:
Each call allocates a new activation record on the stack.
Return pops the record and resumes.
Recursive Calls:
q calls itself, creating multiple frames.
Nested Calls:
Call order: m → q → p → q → q (p returns to q before q calls itself)
Stack Allocation:
Each procedure instance manages its locals + return address.
Summary Table
Procedure Characteristics
m Main program, calls q, halts
p Simple subroutine, single return
q Recursive, calls p, calls itself twice
This structure demonstrates stack-based memory management used in
compilers for nested and recursive calls.
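The growth and shrinkage of the stack for m, p, and q can be traced with a short Python sketch. One assumption is added that the pseudocode does not show: q stops recursing once `max_depth` q frames are on the stack, since the calls as written are unconditional and would never terminate. The function name `run_m` and the event strings are hypothetical.

```python
# Activation-record trace for procedures m, p, q (illustrative sketch).
# Assumption: q stops recursing once max_depth q frames are on the stack.
def run_m(max_depth=2):
    stack, events = [], []

    def enter(name):
        stack.append(name)                       # allocate activation record
        events.append("push " + "/".join(stack))

    def leave():
        stack.pop()                              # free activation record
        events.append("pop -> " + ("/".join(stack) or "(empty)"))

    def p():
        enter("p")
        leave()

    def q():
        enter("q")
        if stack.count("q") < max_depth:         # assumed base case
            p()
            q()
            q()
        leave()

    enter("m")
    q()
    leave()
    return events
```

With `max_depth=2`, the deepest stacks reached are m/q/p and m/q/q, and every push is matched by a pop in last-in, first-out order.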
Target Code for Procedure m
100: LD SP, #600 // initialize the stack
108: ACTION // action
128: ADD SP, SP, #msize // reserve space on stack
136: ST *SP, #152 // save return address
144: BR 300 // call q
152: SUB SP, SP, #msize // restore SP
160: ACTION // action
180: HALT // terminate execution
Target Code for Procedure p
200: ACTION // perform p’s action
220: BR *0(SP) // return using address on top of stack
Target Code for Procedure q (Part 1)
300: ACTION // perform q’s first action
320: ADD SP, SP, #qsize // allocate stack frame
328: ST *SP, #344 // save return address
336: BR 200 // call p
344: SUB SP, SP, #qsize // restore SP after return
352: ACTION // next action in q
Target Code for Procedure q (Part 2)
372: ADD SP, SP, #qsize // allocate for recursive call
380: ST *SP, #396 // push return address
388: BR 300 // recursive call to q
396: SUB SP, SP, #qsize // restore SP
404: ACTION // third action in q
Target Code for Procedure q (Part 3)
424: ADD SP, SP, #qsize // allocate for final call
432: ST *SP, #448 // push return address
440: BR 300 // call q again
448: SUB SP, SP, #qsize // restore SP
456: BR *0(SP) // return to caller
600: ... // stack base address
Concept Summary
Each procedure call pushes a return address and allocates a frame.
Returns are handled via indirect jump using the return address from
the top of the stack.
Stack pointer (SP) is adjusted to manage the lifetime of activation
records.
Supports nested and recursive procedure calls using LIFO discipline.