Overall Structure
Chuen-Liang Chen
Department of Computer Science and Information Engineering National Taiwan University
clchen@[Link]
Overall Structure Chuen-Liang Chen, NTU CS&IE / 1
Outline
  Language Processors
  Phases of a Compiler
  Compiler Structure
  Compiler Construction
  Compiler Installation/Porting
Language Processors
Interpreter
  source program + input -> Interpreter -> output
Compiler
  source program -> Compiler -> target program
  input -> target program -> output
Hybrid (virtual machine with just-in-time compilation)
  source program -> Compiler -> intermediate program
  intermediate program + input -> Virtual Machine (incl. interpreter) -> output
  Just In Time Compiler -> target program for hotspot
Tool Chain
toolchain: a set of linked development tools
  text editor -> preprocessor -> compiler -> assembler -> linker/loader
  debugger, libraries
GNU toolchain
  GNU make
  GCC
  GNU binutils (binary utilities)
  GNU debugger (gdb)
  libraries
output of a compiler may be:
  assembly
  relocatable
  absolute binary
  other source
GNU Make
to automate the process of converting files
rule format
  target: prerequisites
  (tab) command 1
  (tab) ...
  (tab) command n
  target: file to be created, or action name
  prerequisites: input files needed to create the target
  commands: the steps that make the target; they run if any of the prerequisites has changed
simple example
  helloWorld: helloWorld.o
      cc -o helloWorld helloWorld.o
  helloWorld.o: helloWorld.c
      cc -c helloWorld.c
  clean:
      rm helloWorld helloWorld.o
usage: make or make clean
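The rule "run the commands if any of the prerequisites has changed" can be sketched in C. This is a minimal model, not GNU make's implementation: it assumes modification times are already available as plain numbers, and the function name needs_rebuild is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if the target must be rebuilt, 0 if it is up to date.
 * A negative target_mtime models a target that does not exist yet
 * (an assumption of this sketch). */
int needs_rebuild(long target_mtime, const long *prereq_mtimes, size_t n)
{
    if (target_mtime < 0)
        return 1;                       /* missing target: always build */
    for (size_t i = 0; i < n; i++) {
        if (prereq_mtimes[i] > target_mtime)
            return 1;                   /* a prerequisite is newer */
    }
    return 0;                           /* nothing changed since last build */
}
```

For the example above, helloWorld.o would be rebuilt only when helloWorld.c is newer than it, while clean always runs because no file named clean is ever created.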
Phases of a Compiler
Analyses
lexical
  regular expression
syntactic (structure)
  context-free grammar
  syntax error
    a=b+;
semantics (meaning)
  static semantics
    attribute grammar, Vienna definition language, ...
    static semantic error
      int a, b;
      boolean c;
      a=b+c;
  run-time semantics
    run-time semantic error
      int a, b, c;
      scanf("%d", &b);   // assume: the input is the maximum int
      c=1;
      a=b+c;
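The run-time semantic error above (b holds the maximum int, c = 1, a = b + c) cannot be caught at compile time because b is only known when the program runs. A minimal sketch of a run-time check follows; the helper name checked_add is hypothetical and not part of the slides.

```c
#include <limits.h>

/* Stores a + b in *sum and returns 1.
 * Returns 0, leaving *sum untouched, if the sum does not fit in an int.
 * The test is done before adding because signed overflow is undefined in C. */
int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *sum = a + b;
    return 1;
}
```

With b equal to INT_MAX and c equal to 1, checked_add(b, c, &a) reports the error instead of silently producing an invalid result.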
Phases of a Compiler Example
Compiler Structure
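syntax-directed translation
  driven by the syntactic structure of the source program
call graph (multi-pass)
  [figure: main drives the parser (which requests tokens from the scanner), the semantic routines, the optimizers opt 1 and opt 2, and the code generator; source code goes in and machine code comes out, with SS and IR passed between the phases; the symbol table and attribute table are shared]
  SS : syntactic structure (abstract syntax tree)
  IR : intermediate representations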
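The parser-calls-scanner relationship in the call graph can be sketched as follows. This is a minimal illustration, assuming a one-rule toy grammar (stmt -> id '=' id { '+' id } ';') that is not taken from the slides; all names are hypothetical, and semantic routines are left out.

```c
#include <ctype.h>

typedef enum { TOK_ID, TOK_ASSIGN, TOK_PLUS, TOK_SEMI, TOK_END, TOK_ERROR } Token;

/* Scanner: returns the next token and advances *p past it. */
static Token next_token(const char **p)
{
    while (isspace((unsigned char)**p))
        (*p)++;
    char c = **p;
    if (c == '\0')
        return TOK_END;
    if (isalpha((unsigned char)c)) {
        while (isalnum((unsigned char)**p))
            (*p)++;
        return TOK_ID;
    }
    (*p)++;
    switch (c) {
    case '=': return TOK_ASSIGN;
    case '+': return TOK_PLUS;
    case ';': return TOK_SEMI;
    default:  return TOK_ERROR;         /* lexical error */
    }
}

/* Parser for: stmt -> id '=' id { '+' id } ';'
 * Returns 1 if src is exactly one well-formed statement, 0 otherwise. */
int parse_statement(const char *src)
{
    const char *p = src;
    if (next_token(&p) != TOK_ID) return 0;
    if (next_token(&p) != TOK_ASSIGN) return 0;
    if (next_token(&p) != TOK_ID) return 0;
    Token t = next_token(&p);
    while (t == TOK_PLUS) {
        if (next_token(&p) != TOK_ID)
            return 0;                   /* "a=b+;" is rejected here */
        t = next_token(&p);
    }
    if (t != TOK_SEMI) return 0;
    return next_token(&p) == TOK_END;
}
```

The parser never looks at characters directly; it only asks the scanner for the next token, which is the division of labor the call graph describes.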
Intermediate Representation of a Compiler
front-end
  from source code to intermediate code
  analysis phases
  language dependent
intermediate representation (intermediate code)
back-end
  from intermediate code to target code
  synthesis phases
  machine dependent
[figure: C, Fortran, and Java front-ends -> intermediate representation -> arm, i386, mips, and sparc back-ends]
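A small sketch of why a shared intermediate representation pays off: one IR instruction can be handed to any back-end, so m front-ends and n back-ends need m + n translators instead of m * n. The IR layout and both instruction syntaxes below are hypothetical and do not correspond to any real GCC target.

```c
#include <stdio.h>

/* One three-address IR instruction: dst = lhs + rhs (register numbers). */
typedef struct { int dst, lhs, rhs; } IrAdd;

/* Back-end for a hypothetical three-operand machine. */
int emit_three_operand(const IrAdd *ir, char *buf, size_t size)
{
    return snprintf(buf, size, "add r%d, r%d, r%d", ir->dst, ir->lhs, ir->rhs);
}

/* Back-end for a hypothetical two-operand machine: copy, then add in place.
 * Assumes dst != rhs, otherwise the copy would overwrite an operand. */
int emit_two_operand(const IrAdd *ir, char *buf, size_t size)
{
    return snprintf(buf, size, "mov r%d, r%d\nadd r%d, r%d",
                    ir->dst, ir->lhs, ir->dst, ir->rhs);
}

/* Translators needed for m source languages and n target machines. */
int translators_with_ir(int m, int n)    { return m + n; }
int translators_without_ir(int m, int n) { return m * n; }
```

For the three front-ends and four back-ends in the figure, that is 7 components rather than 12 separate compilers.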
GCC History
history / releases
  1987       GNU C Compiler
  1997~1999  Experimental GNU Compiler System (EGCS), pronounced "eggs"
  1999       GCC Development Mission Statement
  1999       EGCS & GCC reunited; winner: EGCS
  1999       GNU Compiler Collection 2.95
  2001       GCC 3.0
  2005/04    GCC 4.0.0
  2006/02    GCC 4.1.0
  2007/05    GCC 4.2.0
  2008/03    GCC 4.3.0
  2009/04    GCC 4.4.0
  2010/04    GCC 4.5.0
  2011/03    GCC 4.6.0
  2011/10    GCC 4.6.2
design and development goals (from Mission Statement)
  new languages (front-end, language-dependent)
  new optimizations (middle-end)
  new targets (back-end, machine-dependent)
  improved runtime libraries
  faster debug cycle
  various other infrastructure improvements
GCC Structure
source
  -> parsing
  -> gimplification
  -> build_cfg
  -> tree / cfg   (tree optimizations)
  -> expand
  -> rtl / cfg    (rtl optimizations)
  -> final
  -> assembly
the figure labels the earlier part machine independent and the later part machine dependent
Compiler Construction
hand-written
compiler generator
  aka compiler compiler
how to use C to write a C compiler?
Pascal P-code (1/5)
notation: (format, input -> output)
  formats: M = Machine code, A = Assembly, Pc = P-code, Pa = Pascal, D = Data
P-code Interpreter (A, Pc)
Assembler (M, A -> M)
P-code interpreter (M, Pc)
Sorting (A, D -> D)
Sorting (M, D -> D)
  65 53 94 06 63 -> 06 53 63 65 94
Platform (Hardware + OS)
without high level language support
Interpreter
software CPU
  int IR;                       // IR: instruction register
  int PC = 0;                   // PC: program counter
  int code[ ];
  while (1) {                   // clock
    IR = code[PC];              // instruction fetching
    switch (IR) {               // instruction decoding
      // instruction execution
      // data fetching (using PC)
      case 1: execution of opcode 1; break;
      case 2: execution of opcode 2; break;
      ...
      case n: execution of opcode n; break;
    }
    PC += instruction_length;   // update program counter
  }
easier than a compiler
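The fetch, decode, execute loop can be made concrete with a tiny stack machine. This is a minimal sketch: the four opcodes and the function name run are invented for illustration and are not real P-code.

```c
/* Hypothetical opcodes for a tiny stack machine. */
enum { OP_HALT = 0, OP_PUSH = 1, OP_ADD = 2, OP_MUL = 3 };

/* Runs code[] and returns the value on top of the stack at OP_HALT.
 * Assumes a well-formed program (no stack underflow or overflow). */
int run(const int *code)
{
    int stack[64];
    int sp = 0;                           /* number of values on the stack */
    int PC = 0;                           /* program counter */

    for (;;) {                            /* clock */
        int IR = code[PC];                /* instruction fetching */
        int length = 1;                   /* instruction length in words */
        switch (IR) {                     /* instruction decoding */
        case OP_PUSH:
            stack[sp++] = code[PC + 1];   /* data fetching (using PC) */
            length = 2;
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
        default:                          /* unknown opcodes also stop the machine */
            return sp > 0 ? stack[sp - 1] : 0;
        }
        PC += length;                     /* update program counter */
    }
}
```

Running the program PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT evaluates (2 + 3) * 4 and returns 20.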
Pascal P-code (2/5)
notation: (format, input -> output)
  formats: M = Machine code, A = Assembly, Pc = P-code, Pa = Pascal, D = Data
P-code Interpreter (A, Pc)
Assembler (M, A -> M)
P-code interpreter (M, Pc)
Pascal Compiler (Pa, Pa -> Pc)
Pascal Compiler (Pc, Pa -> Pc)
Sorting (Pa, D -> D)
Sorting (Pc, D -> D)
  65 53 94 06 63 -> 06 53 63 65 94
Platform (Hardware + OS)
slower compilation, slower execution
Pascal P-code (3/5)
notation: (format, input -> output)
  formats: M = Machine code, A = Assembly, Pc = P-code, Pa = Pascal, D = Data
P-code Interpreter (A, Pc)
Assembler (M, A -> M)
P-code interpreter (M, Pc)
Pascal Compiler (Pa, Pa -> Pc)
Pascal Compiler (Pc, Pa -> Pc)
Pascal Compiler (Pa, Pa -> M)
Pascal Compiler (Pc, Pa -> M)
Sorting (Pa, D -> D)
Sorting (M, D -> D)
  65 53 94 06 63 -> 06 53 63 65 94
Platform (Hardware + OS)
slower compilation, faster execution
Pascal P-code (4/5)
notation: (format, input -> output)
  formats: M = Machine code, A = Assembly, Pc = P-code, Pa = Pascal, D = Data
P-code Interpreter (A, Pc)
Assembler (M, A -> M)
P-code interpreter (M, Pc)
Pascal Compiler (Pa, Pa -> Pc)
Pascal Compiler (Pc, Pa -> Pc)
Pascal Compiler (Pa, Pa -> M)
Pascal Compiler (Pc, Pa -> M)
Pascal Compiler (M, Pa -> M)
Sorting (Pa, D -> D)
Sorting (M, D -> D)
  65 53 94 06 63 -> 06 53 63 65 94
Platform (Hardware + OS)
faster compilation, faster execution
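The bootstrapping steps in these diagrams follow one rule: feeding a program to a translator changes the program's format and nothing else. A minimal sketch of that rule, using the (format, input -> output) notation; the type and function names are hypothetical, and the sketch does not check that the tool itself is in a format the platform can run.

```c
/* Formats used in the diagrams. */
typedef enum { FMT_M, FMT_A, FMT_PC, FMT_PA } Format;

/* A translator "(format, input -> output)": stored in `format`,
 * it translates programs from `input` to `output`. */
typedef struct { Format format, input, output; } Translator;

/* Feeds `subject` to `tool`. Only the format of `subject` changes;
 * what it translates stays the same.
 * Returns 1 on success, 0 if `tool` does not accept the subject's format. */
int translate(Translator tool, Translator subject, Translator *result)
{
    if (tool.input != subject.format)
        return 0;
    *result = subject;
    result->format = tool.output;
    return 1;
}
```

For example, feeding Pascal Compiler (Pa, Pa -> M) to Pascal Compiler (Pc, Pa -> Pc) gives Pascal Compiler (Pc, Pa -> M), and feeding the same source to that result gives Pascal Compiler (M, Pa -> M), which no longer needs the P-code interpreter.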
Pascal P-code (5/5)
notation: (format, input -> output)
  formats: M = Machine code, A = Assembly, Pc = P-code, Pa = Pascal, D = Data
P-code Interpreter (A, Pc)
Assembler (M, A -> M)
P-code interpreter (M, Pc)
Pascal Compiler (Pa, Pa -> Pc)
Pascal Compiler (Pc, Pa -> Pc)
Pascal Compiler (Pa, Pa -> A)
Pascal Compiler (Pc, Pa -> A)
Pascal Compiler (A, Pa -> A)
Pascal Compiler (M, Pa -> A)
Platform (Hardware + OS)
easier implementation
Compiler Installation/Porting
build a GCC on a Sparc workstation (build machine)
run the GCC for a sorting program on an i386 PC (host machine)
execute the sorting on an Arm embedded system (target machine)
classification of compilers
  build  host  target  classification  application example
  A      A     A       native          compiler upgrade
  A      A     C       cross           developing env. for embedded system
  A      B     A       cross-back
  A      B     B       crossed native  new CPU, powerful enough to run compiler
  A      B     C       canadian
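The classification depends only on which of the three machines coincide. A minimal sketch, assuming machines are identified by name strings; the function name classify is hypothetical.

```c
#include <string.h>

/* Classifies a compiler by its build, host, and target machines,
 * using the class names from the classification of compilers. */
const char *classify(const char *build, const char *host, const char *target)
{
    int build_is_host   = strcmp(build, host) == 0;
    int host_is_target  = strcmp(host, target) == 0;
    int build_is_target = strcmp(build, target) == 0;

    if (build_is_host && host_is_target) return "native";          /* A A A */
    if (build_is_host)                   return "cross";           /* A A C */
    if (build_is_target)                 return "cross-back";      /* A B A */
    if (host_is_target)                  return "crossed native";  /* A B B */
    return "canadian";                                             /* A B C */
}
```

The opening example (built on sparc, run on i386, generating code for arm) has three different machines, so classify("sparc", "i386", "arm") yields "canadian".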
Build, Host & Target Machines
notation: (machine it runs on, input -> output)
Build Machine
  compiler executable (B, src -> H)
  input: compiler source
Host Machine
  compiler executable (H, src -> T)
  input: application source
Target Machine
  application executable (T, ? -> ?)
Comparison of Different Versions
notation: (s/w function) [code format] {code quality}
sources
  sorting [src]
  searching [src]
different compiler function: gcc 2.95 vs. gcc 4.6.2, compiling sorting
  gcc 2.95 (C -> sparc) [any exe] gives sorting [sparc exe] {2.95 tech}
    worse execution
  gcc 4.6.2 (C -> sparc) [any exe] gives sorting [sparc exe] {4.6.2 tech}
    better execution
same compiler function, different compiler code quality, compiling searching
  gcc 4.6.2 (C -> sparc) [sparc exe] {2.95 tech} gives searching [sparc exe] {4.6.2 tech}
    worse compilation
    same execution
  gcc 4.6.2 (C -> sparc) [sparc exe] {4.6.2 tech} gives searching [sparc exe] {4.6.2 tech}
    better compilation
    same execution
Native GCC Building
build, host, and target are the same machine
assume: each GCC stage bundles the existing binutils (as, ld)
inputs
  gcc 4.6.2 [Link] [src]
  gcc 2.95 (C -> sparc) [sparc exe]
  library [src]
  helloWorld [src]
stages
  stage 1: gcc 4.6.2 (C -> sparc) [sparc exe] {2.95 tech}
    new function, worse code quality
  stage 2: gcc 4.6.2 (C -> sparc) [sparc exe] {4.6.2 tech}
    new function, better code quality
  stage 3: gcc 4.6.2 (C -> sparc) [sparc exe] {4.6.2 tech}
    same as stage 2, converged
results (compiled with -c)
  library [sparc obj] {4.6.2 tech}
  helloWorld [sparc exe] {4.6.2 tech}
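The convergence of the stages can be modelled in a few lines. This is a minimal sketch under the slide's notation: an executable's function comes from the source it was built from, and its code quality comes from the function of the compiler that built it. The type and function names, and the encoding of releases as the integers 295 and 462, are hypothetical.

```c
/* A compiler executable: which release's function it implements, and
 * which release's technology generated its own machine code. */
typedef struct { int function; int quality; } CompilerExe;

/* Builds compiler source of release `src` with the compiler `tool`. */
CompilerExe build_stage(CompilerExe tool, int src)
{
    CompilerExe out = { src, tool.function };
    return out;
}

/* Two executables are the same if function and code quality both match. */
int same_exe(CompilerExe a, CompilerExe b)
{
    return a.function == b.function && a.quality == b.quality;
}
```

Starting from gcc 2.95, the first build gives {4.6.2 function, 2.95 tech}, the second gives {4.6.2 function, 4.6.2 tech}, and every later build reproduces the second, which is why a third stage is enough to confirm convergence.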
Cross GCC Building
assume: build machine (sparc) already has the newest native gcc & binutils
cross binutils is required !!!
inputs
  gcc [Link] [src]
  binutils cpu-arm.c [src]
  library [src]
  helloWorld [src]
native tools on the build machine
  gcc (C -> sparc) [sparc exe]
  binutils (sparc) [sparc exe]
cross tools, built with the native tools (-c)
  gcc (C -> arm) [sparc exe]
  binutils (arm) [sparc exe]
results of the cross tools
  library [arm asm]      (-S)
  library [arm obj]      (-c)
  helloWorld [arm exe]
Compiler and Computer Science
compiler and programming language
  new language features -> new compilation challenge
  existing compiling problem -> modified language features
compiler and architecture/platform
  new resources -> new compilation challenge
  optimization technology
other fields in computer science