0% found this document useful (0 votes)
26 views

01 Intro

This document introduces compilers and programming languages. It discusses the evolution of programming languages from machine languages to modern high-level languages. It then outlines the structure of compilers, including lexical analysis, syntax analysis, semantic analysis, code generation, and optimization. Finally, it covers key concepts in programming languages such as scopes, blocks, environments, and parameter passing mechanisms.

Uploaded by

Md Ashik
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
26 views

01 Intro

This document introduces compilers and programming languages. It discusses the evolution of programming languages from machine languages to modern high-level languages. It then outlines the structure of compilers, including lexical analysis, syntax analysis, semantic analysis, code generation, and optimization. Finally, it covers key concepts in programming languages such as scopes, blocks, environments, and parameter passing mechanisms.

Uploaded by

Md Ashik
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

Compilers: Principles, Techniques, and Tools

Chapter 1 Introduction

Dongjie He
University of New South Wales
https://dongjiehe.github.io/teaching/compiler/

29 Jun 2023

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 1 / 21
Outline

1 Programming Languages
Evolution
Basics
2 Language Processors
Compilers, Interpreters, and Other Language Processors
A Language-Processing System
3 Compiler Structure
Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Code Generation and Optimization
Code Generation
Symbol-Table Management
4 Applications of Compiler Technology
. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 2 / 21
Programming Languages Evolution

Evolution of Programming Languages


Machine Languages: 1940’s, sequences of 0’s and 1’s
programming was slow, tedious, and error-prone.
programs were hard to understand and modify.
Assembly Languages: early 1950’s
mnemonics of machine instructions
macros: parameterized shorthands for frequently used instructions
High-Level Languages: 1955 to present
programming is easier, more natural, and more robust.
General Purpose Languages: Fortran, Cobol, Lisp, C, C++, Java, ...
Domain-Specific Languages: NOMAD, SQL, Postscript, ...
Logic- and constraint-based Languages: Prolog, OPS5, ...
Other Classification: 0
Imperative languages: C, C++, Java, Rust, ...
Functional languages: ML, Haskell, OCaml, ...
Constraint logic languages: Prolog, Datalog, ...
Von Neumann language/Object-oriented language/Scripting
.
language
. . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 3 / 21
Programming Languages Basics

Programming Language Basics I


A language policy allows the compiler to decide an issue.
Static policy: the issue is decided at compile time.
Dynamic policy: the issue is decided at run time.
e.g., “static int x;”, “static” in Java enables the compiler to
determine the location of ‘x’ in memory.
The scope of a declaration of x is the region in which uses of x refer
to this declaration.
Static scope: can be determined by looking only at the program.
Dynamic scope: determined by the program runs.

int i; // global i (defvar x 1) ; global variable 'x' with value 1


void f(...) { (defun foo () (message "The value of 'x' in foo: %s" x))
int i; // local i (defun bar ()
i = 3; // use of local i (let ((x 2)) ; local variable 'x' with value 2
} (foo))) ; Call foo from within bar
x = i + 1; // use of global i (bar) ; Call bar from the global scope
A C/Java Program An Emacs-Lisp Program
. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 4 / 21
Programming Languages Basics

Programming Language Basics II


A block is a grouping of declarations and statements.
C family languages use braces { and } to delimit a block.
Algol and Pascal use begin and end.
Syntax of blocks in C
stmt := block | · · · block := declarations stmts
Block structure: blocks nested inside each other
Static-scope of a declaration D (which belongs to block B) of name x
B is the most closely nested block containing D
the scope of D is all of B, except for any blocks B′ nested to any depth
within B, in which x is redeclared.

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 5 / 21
Programming Languages Basics

Programming Language Basics III


Member Scope in Classes/Structures
the scope of a member declaration x in a class C extends to any
subclass C′ , except if C′ has a local declaration of the same name x
Explicit Access Control
public/protected/private
Parameter Passing Mechanisms
Actual Parameters/Formal Parameters
e.g., “A id(A p) {return p;} r = id(a); ”
#include <stdio.h> #include <stdio.h> procedure swap(a,b); Algol-60
void incrementByValue(int num) { void incrementByReference(int* num) { integer a, b;
num = num + 1; (*num) = (*num) + 1; begin
printf("Inside function: %d\n", num); printf("Inside function: %d\n", *num); integer tmp;
} } tmp := a; a := b; b := tmp;
int main() { int main() { end;
int x = 5; int x = 5; integer i, x;
printf("Before function call: %d\n", x); printf("Before function call: %d\n", x); integer array arr[0:9];
incrementByValue(x); incrementByReference(&x); x := 2;
printf("After function call: %d\n", x); printf("After function call: %d\n", x); for i := 0 step 1 until 9 do arr[i] := 10 - i;
return 0; return 0; swap(x, arr[x]);
} } print(x); printArray(arr);

(a) Call by Value (b) Call by Reference . . . . . . (c)


. Call
. . by. Name
. . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 6 / 21
Programming Languages Basics

Programming Language Basics III


Environments and States
environment: a mapping from names to locations
state: a mapping from locations to their values

e.g., “x = y + 1” changes the value in the location denoted by name x


Aliasing
x and y are aliases of each other if they can refer to the same location
#include <stdio.h>
int main() {
int x[3] = {1, 2, 3};
int* y = x;
y[1] = 4;
printf("%d", x[1]);
return 0;
} . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 7 / 21
Language Processors Compilers, Interpreters, and Other Language Processors

Compilers and Interpreters

Compilers
source program Compiler target program
e.g., hello.c e.g., gcc e.g., hello.o

Interpreters
source program
e.g., hello.py Interpreter output
input e.g., python3

Hybrid Compilers
Translator source program
e.g., javac e.g., hello.java

intermediate program
e.g., hello.class Virtual Machine output
input e.g., JVM
. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 8 / 21
Language Processors Compilers, Interpreters, and Other Language Processors

Other Language Processors

Preprocessors: C Preprocessor, “/usr/bin/cpp”


modify source code before compilation
allow to write code in a more convenient and expressive manner
macro expansion, conditional compilation, file inclusion, constant or
variable substitution
Assemblers
translate assembly language code into relocatable machine code
specific to the target architecture or processor
Linkers: GNU Linker, “/usr/bin/ld”
combine object files to create an executable program or a library
static linking/dynamic linking
Loaders: put all executable object files into memory for execution
Debuggers: GNU Debugger, “/usr/bin/gdb”
Breakpoints, Stepping, Variable or Memory inspection, Call stack
analysis, ...
. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 9 / 21
Language Processors A Language-Processing System

A Language-Processing System

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 10 / 21
Compiler Structure

Phases of a Compiler
front end or analysis part

character stream
intermediate representation

Lexical Analyzer Machine-Independent


Code Optimizer
token stream
intermediate representation
Syntax Analyzer
Code Generator
syntax tree

Semantic Analyzer target-machine code

syntax tree Machine-Dependent


Code Optimizer
Intermediate Code Generator target-machine code

back end or synthesis part

Symbol Table
. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 11 / 21
Compiler Structure Lexical Analysis

Lexical Analysis
lexical analysis/scanning
grouping characters into meaningful lexemes sequences
lexeme = ⟨token-name, attribute-value⟩
token-name: abstract symbol used during syntax analysis
attribute-value: stored in symbol-table and used in semantic analysis
and code generation
An example

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 12 / 21
Compiler Structure Syntax Analysis

Syntax Analysis
syntax: describes the proper form of programs
syntax analysis/parsing
transform tokens into an Intermediate Representation, e.g., syntax tree
IR depicts the grammatical structure of the token stream
An example

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 13 / 21
Compiler Structure Semantic Analysis

Semantic Analysis
semantics: define what programs mean
semantic analysis
check semantic consistency with the language definition
gather and save type information in syntax tree or symbol table
type checking and type conversions (coercions)
An example

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 14 / 21
Compiler Structure Intermediate Code Generation and Optimization

Intermediate Code Generation

generate low-level/machine-like intermediate representation


a program for an abstract machine
easy to produce and easy to translate into target machine
An example: three-address code

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 15 / 21
Compiler Structure Intermediate Code Generation and Optimization

Code Optimization

machine-independent/machine-dependent code-optimization
objectives: produce better target code
faster running time
shorter target code
consuming less power
...
An example

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 16 / 21
Compiler Structure Code Generation

Code Generation

map an intermediate representation into target language


instruction selection
register allocation

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 17 / 21
Compiler Structure Symbol-Table Management

Symbol-Table Management

Symbol table:
a data structure recording each variable name and attributes, including
storage, type, scope, ...

should be efficient for finding records and storing or retrieving data


from the records.

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 18 / 21
Applications of Compiler Technology

Applications of Compiler Technology


Compiler design is not only about compilers
Applications
Implementation of High-Level Programming Languages
support increased levels of programming abstractions
e.g., data abstraction and inheritance in Java, type system in Rust
Optimizations for Computer Architectures
parallelism: instruction level and processor level
memory hierarchy: closest to the processor being fastest but smallest
Design of New Computer Architectures
RISC architecture
Specialized architecture, e.g., VLIW, SIMD, vector machine, ...
Program Translations
Binary translation: translate binary code for one machine to another
Hardware synthesis: translate RTL descriptions into gates
Database Query Interpreters, Compiled Simulation
Software Productivity Tools
e.g., bug detection, type checking, bounds. checking,
. . . . . . ....
. . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 19 / 21
End-of-Chapter

Compilers: Principles, Techniques, and Tools


Chapter 1 Introduction

Dongjie He
University of New South Wales
https://dongjiehe.github.io/teaching/compiler/

29 Jun 2023

. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 20 / 21
End-of-Chapter

Lab 1: Get Familiar with LLVM’s Code Structure

LLVM Project, https://llvm.org/, is a collection of modular and


reusable compiler and toolchain technologies.
LLVM Core: a modern source- and target-independent optimizer
Clang: an LLVM native C/C++/Objective-C compiler
LLD: a linker
...

Task 1: install LLVM on your machine


https://llvm.org/docs/GettingStarted.html#getting-the-source-code-and-building-llvm

Task 2: getting familiar with LLVM’s code structure


. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
UNSW, Sydney Compilers 29 Jun 2023 21 / 21

You might also like