Three-Pass Compiler Optimization Techniques

This document discusses three-pass compilers, which use an intermediate representation (IR) so that optimization can take place between the front and back ends. It describes how optimizers apply transformations to the IR in multiple passes to improve properties such as performance and memory usage. It also covers lexical analysis, including how ad-hoc lexers can be written by hand and how lexer generators address their limitations.

Uploaded by

Mehtab Hashim
Lecture 3:

Compiler Construction

CS 401
DR. ABDUL MAJID, DCIS
Previous Lecture 2
In the previous lecture we discussed:
• Two main parts: front end and back end
• Compiler types: two-pass compiler
• Compiler analysis: front end and back end
• Compiler components: scanner (lexical analysis, regular expressions, tokens, lexemes)
• Parser: context-free grammars
• Abstract syntax trees
• Back end of the compiler

Today: Lecture 3 Contents
• COMPILER TYPES: THREE-PASS COMPILER

• OPTIMIZER

• LEXICAL ANALYSIS, TOKENS, AD-HOC LEXER


Three-pass Compiler

source code → Front End → IR → Middle End → IR → Back End → machine code
(each stage may report errors)

The middle end:
• Is an intermediate stage for code improvement or optimization
• Analyzes the IR and rewrites (or transforms) the IR
• Has the primary goal of reducing the running time of the compiled code
• May also improve space usage, power consumption, ...
• Must preserve the “meaning” of the code, measured by the values of named variables
Optimizer

IR → Opt 1 → IR → Opt 2 → IR → Opt 3 → IR → ... → Opt n → IR
(each pass may report errors)

Modern optimizers are structured as a series of passes.

Typical transformations:
• Discover and propagate some constant value
• Move a computation to a less frequently executed place
• Specialize some computation based on context
• Discover a redundant computation and remove it
• Remove useless or unreachable code
• Encode an idiom in some particularly efficient form
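
To make the "series of passes" idea concrete, here is a minimal sketch in Java. Everything in it is an assumption made for illustration: the `Instr` class is a hypothetical three-address IR, and `propagateConstants` and `foldConstants` are toy passes that only handle straight-line code with `+` and `*`. It is not how any particular compiler implements these transformations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical IR instruction: "dest = left op right", or "dest = left" when op is null.
final class Instr {
    final String dest, left, op, right;
    Instr(String dest, String left, String op, String right) {
        this.dest = dest; this.left = left; this.op = op; this.right = right;
    }
    static Instr copy(String dest, String src) { return new Instr(dest, src, null, null); }
    @Override public String toString() {
        return op == null ? dest + " = " + left : dest + " = " + left + " " + op + " " + right;
    }
}

final class TinyOptimizer {
    static boolean isConst(String s) { return s.matches("-?\\d+"); }

    // Pass: replace uses of variables whose value is a known constant.
    static List<Instr> propagateConstants(List<Instr> ir) {
        Map<String, String> known = new HashMap<>();   // variable -> constant value
        List<Instr> out = new ArrayList<>();
        for (Instr i : ir) {
            String l = known.getOrDefault(i.left, i.left);
            String r = i.right == null ? null : known.getOrDefault(i.right, i.right);
            known.remove(i.dest);                        // dest is being redefined
            if (i.op == null && isConst(l)) known.put(i.dest, l);
            out.add(new Instr(i.dest, l, i.op, r));
        }
        return out;
    }

    // Pass: evaluate "+" and "*" at compile time when both operands are constants.
    // Integer overflow is ignored in this sketch.
    static List<Instr> foldConstants(List<Instr> ir) {
        List<Instr> out = new ArrayList<>();
        for (Instr i : ir) {
            boolean foldable = i.op != null && isConst(i.left) && isConst(i.right)
                    && (i.op.equals("+") || i.op.equals("*"));
            if (foldable) {
                int a = Integer.parseInt(i.left), b = Integer.parseInt(i.right);
                int v = i.op.equals("+") ? a + b : a * b;
                out.add(Instr.copy(i.dest, Integer.toString(v)));
            } else {
                out.add(i);
            }
        }
        return out;
    }

    // The optimizer itself: IR in, IR out, one pass after another.
    static List<Instr> optimize(List<Instr> ir, List<UnaryOperator<List<Instr>>> passes) {
        for (UnaryOperator<List<Instr>> pass : passes) ir = pass.apply(ir);
        return ir;
    }

    static List<Instr> demo() {
        List<Instr> ir = Arrays.asList(
                Instr.copy("x", "4"),
                new Instr("y", "x", "*", "2"),
                new Instr("z", "y", "+", "a"));
        List<UnaryOperator<List<Instr>>> passes = new ArrayList<>();
        passes.add(TinyOptimizer::propagateConstants);
        passes.add(TinyOptimizer::foldConstants);
        passes.add(TinyOptimizer::propagateConstants);
        return optimize(ir, passes);
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints [x = 4, y = 8, z = 8 + a]
    }
}
```

Each pass takes IR and returns IR, so passes can be reordered or repeated freely. The values of the named variables `x`, `y` and `z` are the same before and after, which is the sense in which the "meaning" of the code is preserved.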
Role of Run-time System
• Memory management
  • Allocate/deallocate
  • Garbage collection
• Run-time type checking
• Error/exception processing
• Interface to OS – I/O
• Support for parallelism
  • Parallel threads
  • Communication and synchronization
Related to Compilers
• Interpreters (direct execution)
• Assemblers
• Preprocessors
• Text formatters (non-WYSIWYG)
• Analysis tools
Lexical Analysis
Recall: Front-End

source code → scanner → tokens → parser → IR
(both the scanner and the parser may report errors)

Output of lexical analysis is a stream of tokens.
Tokens
Example:

if( i == j )
    z = 0;
else
    z = 1;

• Input is just a sequence of characters (here \b stands for a blank):

i f ( \b i \b = = \b j \n \t ....

Goal:
• partition the input string into substrings
• classify them according to their role
A token is a syntactic category.

Natural language: “He wrote the program”
Words: “He”, “wrote”, “the”, “program”

Programming language: “if(b == 0) a = b”
Words: “if”, “(”, “b”, “==”, “0”, “)”, “a”, “=”, “b”

Identifiers: x y11 maxsize
Keywords: if else while for
Integers: 2 1000 -44 5L
Floats: 2.0 0.0034 1e5
Symbols: ( ) + * / { } < > ==
Strings: “enter x” “error”
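
The classification step can be sketched as a function from a lexeme (the substring) to its category. The `TokenKind` enum, the `TokenClassifier` class and the regular expressions below are assumptions chosen to fit the example lexemes listed above; they are not a complete definition of any real language's tokens.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical token categories, mirroring the classes listed on the slide.
enum TokenKind { KEYWORD, IDENTIFIER, INTEGER, FLOAT, STRING, SYMBOL }

final class TokenClassifier {
    static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("if", "else", "while", "for"));

    // Classify an already-separated lexeme according to its role.
    static TokenKind classify(String lexeme) {
        // Keywords look like identifiers, so they must be checked first.
        if (KEYWORDS.contains(lexeme)) return TokenKind.KEYWORD;
        if (lexeme.matches("[A-Za-z_][A-Za-z0-9_]*")) return TokenKind.IDENTIFIER;
        if (lexeme.matches("-?\\d+L?")) return TokenKind.INTEGER;
        if (lexeme.matches("\\d+\\.\\d+|\\d+(\\.\\d+)?[eE][-+]?\\d+")) return TokenKind.FLOAT;
        if (lexeme.length() >= 2 && lexeme.startsWith("\"") && lexeme.endsWith("\""))
            return TokenKind.STRING;
        // Simplification: anything else is treated as a symbol in this sketch.
        return TokenKind.SYMBOL;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"if", "(", "b", "==", "0", ")", "a", "=", "b"}) {
            System.out.println(w + " -> " + classify(w));
        }
    }
}
```

Note that this only covers the second half of the goal (classification). Partitioning the raw character sequence into lexemes is the harder part, and it is what the lexer itself has to do.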
Ad-hoc Lexer
• Hand-write code to generate tokens.
• Partition the input string by reading left-to-right, recognizing one token at a time.
• Look-ahead is required to decide where one token ends and the next token begins.
import java.io.IOException;
import java.io.InputStream;

class Lexer
{
    InputStream s;   // source of characters
    char next;       // look-ahead: the next unconsumed character

    Lexer(InputStream _s) throws IOException
    {
        s = _s;
        next = (char) s.read();   // prime the look-ahead
    }

    // nextToken(), readId(), idChar() and readNumber() are members of this class
}
Token nextToken() {
    // Dispatch on the look-ahead character.
    // Digits are tested first because idChar() also accepts digits.
    if( isDigit(next) )
        return readNumber();
    if( idChar(next) )
        return readId();
    if( next == '"' )
        return readString();
    ...
}
Token readId() throws IOException {
    String id = "";
    // next already holds the first character of the identifier
    while( idChar(next) ) {
        id = id + next;
        next = (char) s.read();   // advance the look-ahead
    }
    // next now holds the first character after the identifier
    return new Token(TID, id);
}
// isAlpha() and isDigit() are assumed character-class helpers
boolean idChar(char c)
{
    if( isAlpha(c) )
        return true;
    if( isDigit(c) )
        return true;
    if( c == '_' )
        return true;
    return false;
}
Token readNumber() throws IOException {
    String num = "";
    // next already holds the first digit of the number
    while( isDigit(next) ) {
        num = num + next;
        next = (char) s.read();   // advance the look-ahead
    }
    return new Token(TNUM, num);
}
Problems:
• We do not know what kind of token we are going to read from seeing only its first character.
• If a token begins with “i”, is it the identifier “i” or the keyword “if”?
• If a token begins with “=”, is it “=” or “==”?
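
One common way a hand-written lexer deals with both questions is to keep reading while the look-ahead character can still extend the current token, and to decide the category only once the token is complete. The sketch below is an illustration under several assumptions: the class name `LookaheadLexer`, the `KIND:lexeme` string output, the keyword set and the `'\0'` end-of-input sentinel are all invented here to keep the example short and self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class LookaheadLexer {
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("if", "else", "while", "for"));
    private static final char EOF = '\0';   // sentinel chosen for this sketch

    private final String input;
    private int pos = 0;
    private char next;                       // one character of look-ahead

    LookaheadLexer(String input) { this.input = input; advance(); }

    private void advance() {
        next = pos < input.length() ? input.charAt(pos++) : EOF;
    }

    // Tokens are returned as "KIND:lexeme" strings to keep the sketch short.
    List<String> tokenize() {
        List<String> out = new ArrayList<>();
        while (next != EOF) {
            if (Character.isWhitespace(next)) {
                advance();
            } else if (Character.isLetter(next) || next == '_') {
                out.add(readWord());
            } else if (Character.isDigit(next)) {
                out.add(readNumber());
            } else if (next == '=') {
                advance();                   // consume the first '='
                if (next == '=') { advance(); out.add("SYM:=="); }
                else             { out.add("SYM:="); }
            } else {
                out.add("SYM:" + next);
                advance();
            }
        }
        return out;
    }

    // Read the whole word first, then decide: keyword or identifier?
    private String readWord() {
        StringBuilder sb = new StringBuilder();
        while (Character.isLetterOrDigit(next) || next == '_') {
            sb.append(next);
            advance();
        }
        String word = sb.toString();
        return (KEYWORDS.contains(word) ? "KW:" : "ID:") + word;
    }

    private String readNumber() {
        StringBuilder sb = new StringBuilder();
        while (Character.isDigit(next)) {
            sb.append(next);
            advance();
        }
        return "NUM:" + sb;
    }

    public static void main(String[] args) {
        // prints [KW:if, SYM:(, ID:i, SYM:==, NUM:0, SYM:), ID:ifx, SYM:=, ID:i]
        System.out.println(new LookaheadLexer("if(i == 0) ifx = i").tokenize());
    }
}
```

In the example input, "if" becomes a keyword, while "i" and "ifx" become identifiers, and "==" and "=" are told apart by a single extra character of look-ahead. Every new multi-character token needs another hand-written special case of this kind, which is part of why this style does not scale well.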
• We need a more principled approach.
• Use a lexer generator that generates an efficient tokenizer automatically.
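
To give a feel for the declarative style a lexer generator works from, the sketch below describes tokens as a table of (name, regular expression) rules and lets one generic loop do the scanning. This is only an approximation: a real generator typically compiles such rules into an automaton ahead of time, whereas this sketch simply leans on `java.util.regex` at run time. The class name `RuleLexer`, the rule set and the `KIND:lexeme` output are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RuleLexer {
    // Token rules as data. Order matters here, because java.util.regex tries
    // alternatives left to right: keywords come before identifiers, "==" before "=".
    private static final String[][] RULES = {
        {"WS",  "\\s+"},
        {"KW",  "(?:if|else|while|for)\\b"},
        {"ID",  "[A-Za-z_][A-Za-z0-9_]*"},
        {"NUM", "\\d+"},
        {"SYM", "==|[=()+*/{}<>;]"},
    };
    private static final Pattern PATTERN = build();

    // Combine all rules into one pattern with a named group per rule.
    private static Pattern build() {
        StringBuilder sb = new StringBuilder();
        for (String[] r : RULES) {
            if (sb.length() > 0) sb.append('|');
            sb.append("(?<").append(r[0]).append('>').append(r[1]).append(')');
        }
        return Pattern.compile(sb.toString());
    }

    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = PATTERN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt())
                throw new IllegalArgumentException("unexpected character at " + pos);
            for (String[] r : RULES) {
                if (m.group(r[0]) != null) {          // this rule produced the match
                    if (!r[0].equals("WS")) out.add(r[0] + ":" + m.group());
                    break;
                }
            }
            pos = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [KW:if, SYM:(, ID:i, SYM:==, NUM:0, SYM:), ID:ifx, SYM:=, ID:i]
        System.out.println(tokenize("if(i == 0) ifx = i"));
    }
}
```

Adding a new kind of token now means adding a row to the table rather than writing another branch and another read method by hand.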
