0% found this document useful (0 votes)
80 views

2020 Winter Term 2 Unit 1b - Static Scalars and Arrays: CPSC 213 2020 ST1 © 2020 Jonatan Schroeder

Uploaded by

Stygian Shadow
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
80 views

2020 Winter Term 2 Unit 1b - Static Scalars and Arrays: CPSC 213 2020 ST1 © 2020 Jonatan Schroeder

Uploaded by

Stygian Shadow
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 74

2020 Winter term 2 1

Unit 1b – Static Scalars and Arrays

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder


▪ Reading
▪ Companion: 1, 2-1-2.3, 2.4.1-2.4.3
▪ Textbook: 3.1-3.2.1
▪ Reference (textbook, as needed): 3.1-3.5, 3.8, 3.9.3
▪ Learning Objectives
▪ list the basic components of a simple computer and describe their function
▪ describe ALU functionality in terms of its inputs and outputs
▪ describe the exchange of data between ALU, registers, and memory
▪ identify and describe the basic components of a machine instruction
▪ outline the steps a RISC machine performs to do arithmetic on numbers stored in memory
▪ outline the steps a simple CPU performs when it processes an instruction
▪ translate between array-element offsets and indices
▪ distinguish static and dynamic computation for access to global scalars and arrays in C
▪ translate between C and assembly language for access to global scalars and arrays
▪ translate between C and assembly for code that performs simple arithmetic
▪ explain the difference between arrays in C and Java

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 2


▪ Develop a model of computation
▪ that is rooted in what the machine actually does
▪ by examining C, bit-by-bit (comparing to Java as we go)
▪ The processor
▪ we will design (and you will implement) a simple instruction set
▪ based on what we need to compute C programs
▪ similar to a real instruction set (MIPS)
▪ The language
▪ we will act as compiler to translate C into machine language
▪ bit by bit, then putting the bits together to do interesting things
▪ edit, debug and run using simulated processor to visualize execution

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 3


▪ Java: ▪ C:
public class Foo {
int a;
static int a;
static int[] b; int b[10];
// array not static, skip for now void foo () {
public void foo () { a = 0;
a = 0;
b[a] = a;
b[a] = a;
} }
}
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 4
▪ CPUs execute instructions, not C or Java code
▪ Execution proceeds in three stages
▪ Fetch: load the next instruction from memory
▪ Decode: figure out from instruction what needs to be done
▪ Execute: do what the instruction asks

▪ These stages are looped over and over again forever

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 5


▪ CPU instructions are very simple
▪ Read a value from memory
▪ Write a value to memory
▪ Add/subtract/AND/OR two numbers
▪ Shift a number
▪ Control flow (we’ll talk about these in unit 1d)

▪ Operations are carried out by an ALU (Arithmetic & Logic


Unit)
▪ ALU is used in the Execute stage

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 6


▪ Human creation: design program and describe it in high-
level language
▪ Compilation: convert high-level human description into
machine-executable text
▪ Execution: a physical machine executes the code text

Human
Creation Compilation Execution
Source Object Dynamic
Idea
Code Code State
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 7
▪ Execution
▪ Parameterized by input values unknown at compilation
▪ Producing output values that are unknowable at compilation
▪ Anything the compiler can compute is called static
▪ Anything that can only be discovered when executing is
called dynamic
Human
Creation Compilation Execution
Source Object Dynamic
Idea
Code Code State
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 8
Consider the code below:
int a;
a = 5;

The value assigned to variable a above is:


A. static
B. dynamic

CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 9


Assume that a variable of type int is assigned a value that is
read from the user through an input dialog. This value is:
A. static
B. dynamic

CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 10


▪ Implements a set of instructions
▪ Each instruction is implemented using logic gates
▪ Built from transistors: fundamental mechanism of computation

▪ The fewer and simpler instructions the better (RISC


philosophy)
▪ Opposing view (CISC): more complex instructions allow for shorter
assembly code

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 11


▪ Say we propose an instruction that does: 𝐴 ← 𝐵 + 𝐶
▪ A, B and C are stored in memory

▪ Instruction parameters: addresses of A, B, C


▪ Each address is 32-bits (modern computers: 64-bits)

▪ Instruction is encoded as set of bits (stored in memory):


▪ Operation name (e.g., add)
▪ Addresses for A, B, C

01 00001024 00001028 0000102c


Operation A B C

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 12


▪ Accessing memory is slow
▪ ~100 cycles for every memory access
▪ fast programs avoid accessing memory when possible
▪ Big instructions are costly
▪ Memory addresses are big (so instructions are big)
▪ Big instructions lead to big programs
▪ Reading instructions from memory is slow (less caching options)
▪ Large instructions use more CPU resources (transfer, storage)

01 00001024 00001028 0000102c

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 13


▪ Register file
▪ Small, fast memory stored in CPU itself
▪ Roughly single cycle access

▪ Registers
▪ Each register named by a number (e.g., 0-7)
▪ Size: architecture’s common integer (32 or 64 bits)

01 00001024 00001028 0000102c

Regs
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 14
▪ Memory instructions handle only memory
▪ Load data from memory into register (slow)
▪ Store data from register into memory (slow)

▪ Other instructions access data in registers


▪ Small and fast

▪ To improve instruction size: share register for destination


and one source
01 00001024 00001028 0000102c

Regs
01 0 1 2 01 0 1
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 15
▪ A special-purpose register can only be used for certain
purposes
▪ May not be accessible by all instructions (e.g., can’t be used as
argument for an add instruction)
▪ May have special meaning or be treated specially by CPU

▪ Examples:
▪ PC (Program Counter): address of the next instruction to execute
▪ IR (Instruction Register): instruction that has just been loaded from
memory

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 16


▪ ISA is formal interface to a processor implementation
▪ defines the instructions the processor implements
▪ defines the format of each instruction

▪ Types of instructions:
▪ math and logic
▪ memory access
▪ control transfer: “gotos” and conditional “gotos”

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 17


▪ Design alternatives:
▪ CISC: simplify compiler design (e.g., Intel x86, x86-64)
▪ RISC: simplify processor implementation

▪ Instruction format
▪ sequence of bits: opcode and operand values
▪ all represented as numbers

▪ Assembly language:
▪ Symbolic (textual) representation of machine code

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 18


▪ RTL: simple, convenient pseudo language to describe
semantics
▪ easy to read/write, describes machine steps

▪ Syntax:
▪ each line is of the form: LHS ← RHS
▪ LHS is memory or register that receives a value
▪ RHS is constant, memory, register or expression on two registers
▪ m[a] is memory in address a
▪ r[i] is register with number i

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 19


▪ Register 0 receives 10:
▪ r[0] ← 10

▪ Register 1 receives memory whose address is in register 0:


▪ r[1] ← m[r[0]]

▪ Register 2 is incremented by value of register 1:


▪ r[2] ← r[2] + r[1]

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 20


▪ Assume the value 7 is stored at address 0x1234.
▪ What is the result of the following sequence of instructions?
r[0] ← 10
r[1] ← 0x1234
r[2] ← m[r[1]]
r[2] ← r[0] + r[2]
m[r[1]] ← r[2]

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 21


▪ Variables are named storage locations for values
▪ Features:
▪ Name
▪ Type/size
▪ Scope
▪ Lifetime
▪ Memory location (address)
▪ Value
▪ Which of these are static? Which are dynamic?
▪ Which are determined or can change while the program is
running?
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 22
▪ In Java & C
▪ Every variable has a type
▪ Compiler enforces type safety: restricts what you can do with a variable
▪ In Java
▪ Strong type safety: references can’t be cast to incompatible types (apparent
type)
▪ Runtime type information: every object carries its actual type in memory
▪ In C
▪ A type defines the size and interpretation of a bunch of bytes
▪ sizeof operator can determine the size at compile-time
▪ No runtime type information: all that matters is how code accesses bytes
▪ In Assembly
▪ No types! Everything is a number! Instructions decide how to interpret them.

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 23


int a; Static Memory Layout
int b[10];
int a; 0x1000: value of a
int b[10]; …
void foo () {
0x2000: value of b[0]
a = 0;
0x2004: value of b[1]
b[a] = a;

}
0x2024: value of b[9]

▪ Allocation is
▪ assigning a memory location to store a specified value (e.g., variable)
▪ associating the variable to the address of that memory location (its name
for reading and writing)
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 24
int a; Static Memory Layout
int b[10];
int a; 0x1000: value of a
int b[10]; …
void foo () {
0x2000: value of b[0]
a = 0;
0x2004: value of b[1]
b[a] = a;

}
0x2024: value of b[9]
▪ Static vs. dynamic computation
▪ Global/static variables can exist before program starts
▪ Compiler allocates static variables, giving them a constant address
▪ No dynamic computation required to allocate the variables, they just exist
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 25
Assume a is a global variable in C. When is space for a allocated? In
other words, when is its address determined?

A. The compiler assigns the address when it compiles the program


B. The compiler calls the memory to allocate a when it compiles the
program
C. The compiler generates code to allocate a before the program starts
running
D. The program locates available space for a before it starts running
E. The program locates available space as soon as the variable is used
for the first time.
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 26
int a; Static Memory Layout
int b[10];
a = 0; 0x1000: value of a

void foo () {
0x2000: value of b[0]
a = 0;
0x2004: value of b[1]
b[a] = a;

}
0x2024: value of b[9]
▪ Key Observation:
▪ Address of a, b[0], b[1], … are constants known to the compiler

▪ Exercise: use RTL to specify instructions needed for a = 0


▪ What if a is assigned a dynamic value? What if it’s assigned another variable?
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 27
Static Memory Layout
What is the proper RTL 0x1000: value of a
instruction for: …
a = 0; 0x2000: value of b[0]
0x2004: value of b[1]
A. r[0x1000] ← 0

0x2024: value of b[9]
B. m[0x1000] ← 0
C. 0x1000 ← r[0]
D. m[r[0x1000]] ← 0
E. 0x1000 ← m[0]

CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 28


int a; Static Memory Layout
int b[10];
b[a] = a; 0x1000: value of a

void foo () {
0x2000: value of b[0]
a = 0;
0x2004: value of b[1]
b[a] = a;

}
0x2024: value of b[9]
▪ Key Observation:
▪ Value of a is usually unknown at compilation time

▪ Even though address of b[0], b[1]… are known statically, which


one is used here is dynamic
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 29
▪ Compiler doesn’t know address of b[a]
▪ Unless it knows the value of a statically (which it might here, but not
in general)
▪ Array access is computed from base and index
▪ Address of element is base plus offset
▪ Offset is index times size of each element
▪ The base address (0x2000) and element size (4, as it is an integer)
are static, but the index value is dynamic (a’s value can change)

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 30


int a; Static Memory Layout
int b[10];
b[a] = a; 0x1000: value of a

void foo () {
0x2000: value of b[0]
a = 0;
0x2004: value of b[1]
b[a] = a;

}
0x2024: value of b[9]

▪ Exercise: use RTL to specify the instructions for b[a] = a


▪ Assume that the compiler does not know the value of a

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 31


What is the proper RTL instruction for:
b[a] = a;
Static Memory Layout
A. m[0x2000+m[0x1000]] ← m[0x1000]
B. m[0x2000+4*m[0x1000]] ← m[0x1000]
0x1000: value of a

C. m[0x2000+m[0x1000]] ← 4*m[0x1000]
0x2000: value of b[0]
D. m[0x2000]+4*m[0x1000] ← m[0x1000]
0x2004: value of b[1]
E. m[0x2000]+m[0x1000] ← m[0x1000]

0x2024: value of b[9]

CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 32


What instructions can we use for a = 0?
▪ Option 1: static address and value
m[0x1000] ← 0x0
▪ 9 bytes (1-byte instruction code plus two 4-byte integers)

▪ Option 2: static address, dynamic value


r[0] ← 0x0
m[0x1000] ← r[0]
▪ 5 bytes for immediate value instruction
▪ 5 bytes for memory instruction (instruction code plus one integer)

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 33


▪ Option 3: dynamic address and value
r[0] ← 0x0
r[1] ← 0x1000
m[r[1]] ← r[0]
▪ 2 bytes for memory instruction
▪ More flexibility (address and value can be dynamic)

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 34


What instructions can we use for b[a] = a?
▪ Option 1: static base address, index and value
m[0x2000+m[0x1000]*4] ← m[0x1000]
▪ Inputs:
▪ Base address (0x2000)
▪ Index address (0x1000)
▪ Input value address (0x1000)
▪ 13 bytes (instruction code plus three integers)

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 35


▪ Option 2: calculate address explicitly
r[0] ← 0x1000
r[1] ← m[r[0]]
r[3] ← r[1]*4 (or r[3] ← r[1]<<2 )
r[2] ← 0x2000
r[2] ← r[2]+r[3]
m[r[2]] ← r[1]
▪ Uses same instruction as before
▪ Issue: array accesses are common

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 36


▪ Option 3: dynamic base address, index and value
r[0] ← 0x1000
r[1] ← m[r[0]]
r[2] ← 0x2000
m[r[2]+r[1]*4] ← r[1]
▪ 2 bytes for memory instruction (opcode plus three registers)

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 37


01 00001024 00001028 0000102c

Regs
01 0 1
13 bytes
2 bytes

▪ Minimize the number of memory instructions in ISA


▪ At most 1 memory access per instruction
▪ No other operation in the same instruction
▪ Minimize the total number of instructions in ISA
▪ Minimize the size of each instruction

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 38


▪ Requirements for scalars (e.g., assign value to global
variable):
▪ Load constant into a register
r[0] ← v (where v is a constant integer)
▪ Store a value in a register into memory at a constant address
m[r[y]] ← r[x] (assume r[y] stores address of value)
▪ Load value in memory into a register
r[x] ← m[r[y]] (assume r[y] stores address of value)

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 39


▪ Additional requirements for arrays
▪ Store value in a register into memory at address in register * 4 plus
base address
▪ m[r[z] + r[x]*4] ← r[y]
▪ Load value in memory at address in register * 4 plus base address
into register
▪ r[y] ← m[r[z] + r[x]*4]

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 40


▪ Generalizing and simplifying, we get:

▪ r[x] ← constant

▪ m[r[x]] ← r[y]
▪ r[y] ← m[r[x]]

▪ m[r[x] + r[y]*4] ← r[z]


▪ r[z] ← m[r[x] + r[y]*4]

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 41


▪ The compiler’s semantic translation
▪ It uses these instructions to compile the program snippets

Name Semantics Assembly Machine


load immediate r[d] ← v ld $v, rd 0d-- vvvvvvvv
load base+offset r[d] ← m[r[s]] ld (rs), rd 1?sd
load indexed r[d] ← m[r[s]+4*r[i]] ld (rs,ri,4), rd 2sid
store base+offset m[r[d]] ← r[s] st rs, (rd) 3s?d
store indexed m[r[d]+4*r[i]] ← r[s] st rs, (rd,ri,4) 4sdi

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 42


int a; r[0] ← 0
int b[10]; r[1] ← 0x1000
m[r[1]] ← r[0]
void foo () {
a = 0;
r[2] ← m[r[1]]
b[a] = a;
}
r[3] ← 0x2000
m[r[3]+r[2]*4] ← r[2]
Name Semantics Assembly Machine
load immediate r[d] ← v ld $v, rd 0d-- vvvvvvvv
load base+offset r[d] ← m[r[s]] ld (rs), rd 10sd
load indexed r[d] ← m[r[s]+4*r[i]] ld (rs,ri,4), rd 2sid
store base+offset m[r[d]] ← r[s] st rs, (rd) 3s0d
store indexed m[r[d]+4*r[i]] ← r[s] st rs, (rd,ri,4) 4sdi

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 43


int a; r[0] ← 0 ld $0, r0
int b[10]; r[1] ← 0x1000 ld $0x1000, r1
m[r[1]] ← r[0] st r0, (r1)
void foo () {
a = 0;
r[2] ← m[r[1]] ld (r1), r2
b[a] = a;
}
r[3] ← 0x2000 ld $0x2000, r3
m[r[3]+r[2]*4] ← r[2] st r2, (r3, r2, 4)
Name Semantics Assembly Machine
load immediate r[d] ← v ld $v, rd 0d-- vvvvvvvv
load base+offset r[d] ← m[r[s]] ld (rs), rd 10sd
load indexed r[d] ← m[r[s]+4*r[i]] ld (rs,ri,4), rd 2sid
store base+offset m[r[d]] ← r[s] st rs, (rd) 3s0d
store indexed m[r[d]+4*r[i]] ← r[s] st rs, (rd,ri,4) 4sdi

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 44


int a; ld $0, r0 # r0 = 0
int b[10]; ld $0x1000, r1 # r1 = address of a
st r0, (r1) # a = 0
void foo () {
a = 0; ld (r1), r2 # r2 = a
b[a] = a; ld $0x2000, r3 # r3 = address of b
} st r2, (r3, r2, 4) # b[a] = a

.pos 0x1000
a: .long 0 # the variable a
00-- 00000000
01-- 00001000 .pos 0x2000
3001 b_data .long 0 # the variable b[0]
1102 .long 0 # the variable b[1]
03-- 00002000 ...
4232
.long 0 # the variable b[9]

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 45


▪ Architecture
▪ Register file: 8 general purpose registers, each 32-bits long
▪ CPU: One cycle per instruction (fetch and execute)
▪ Main memory: byte-addressed, big endian

▪ Instruction format
▪ 2-byte or 6-byte instructions

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 46


▪ Binary format represented as (each character is a hex digit):
x-01, xxsd, x0vv, x-sd vvvvvvvv
▪ Where:
▪ x is an opcode (instruction unique identification)
▪ - means unused portion (ignored by instruction)
▪ s and d are register numbers
▪ vv and vvvvvvvv are immediate/constant values

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 47


▪ Set of lines containing:
[label:] [instruction|directive] [# comment]
▪ where:
directive :: (.pos number) | (.long number)
instruction :: opcode operand+
operand :: $literal|reg|offset(reg)|(reg,reg,4)
reg :: r 0..7
literal :: number
offset :: number
number :: decimal | 0x hex
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 48
▪ Operands have 4 possible addressing modes:
▪ Immediate: constant value obtained from instruction itself
▪ Register: operand is register number; register stores value
▪ Base+Offset: operands are register number and optional offset; register
stores memory address (via offset) of value; offset o=p×4
▪ Indexed: operands are two register numbers; registers store based
address and index of value

Name Semantics Assembly Machine


load immediate r[d] ← v ld $v, rd 0d-- vvvvvvvv
load base+offset r[d] ← m[r[s]+4*p] ld o(rs), rd 1psd
load indexed r[d] ← m[r[s]+4*r[i]] ld (rs,ri,4), rd 2sid
store base+offset m[r[d]+4*p] ← r[s] st rs, o(rd) 3spd
store indexed m[r[d]+4*r[i]] ← r[s] st rs, (rd,ri,4) 4sdi
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 49
Name Semantics Assembly Machine
register move r[d] ← r[s] mov rs, rd 60sd
add r[d] ← r[d] + r[s] add rs, rd 61sd
and r[d] ← r[d] & r[s] and rs, rd 62sd
inc r[d] ← r[d] + 1 inc rd 63-d
inc address r[d] ← r[d] + 4 inca rd 64-d
dec r[d] ← r[d] – 1 dec rd 65-d
dec address r[d] ← r[d] – 4 deca rd 66-d
not r[d] ← ~ r[d] not rd 67-d
shift left r[d] ← r[d] << s shl $s, rd 7dss
shift right r[d] ← r[d] >> -s shr $s, rd 7dss
halt Halt the machine halt F0--
nop Do nothing Nop FF--

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 50


What instruction is A. ld (r4,r4,4), r6
represented by the bytes: B. st r4, (r6,r2,4)
0x46 0x24?
C. st r6, (r2,r4,4)
D. st r6, 2(r4)
E. st r6, 8(r4)
Name Semantics Assembly Machine
load immediate r[d] ← v ld $v, rd 0d-- vvvvvvvv
load base+offset r[d] ← m[r[s]+4*p] ld o(rs), rd 1psd
load indexed r[d] ← m[r[s]+4*r[i]] ld (rs,ri,4), rd 2sid
store base+offset m[r[d]+4*p] ← r[s] st rs, o(rd) 3spd
store indexed m[r[d]+4*r[i]] ← r[s] st rs, (rd,ri,4) 4sdi
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 51
What instruction is A. shl $5, r8
represented by the bytes: B. shl $8, r5
0x75 0xF8?
C. shr $5, r8
D. shr $8, r5
E. shl $0xf8, r5

Name Semantics Assembly Machine


shift left r[d] ← r[d] << s shl $s, rd 7dss
shift right r[d] ← r[d] >> -s shr $s, rd 7dss
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 52
▪ PC (program counter): address of next instruction to fetch
▪ Instruction: value of the current instruction
▪ separated into components:

Ins Op Code
Ins Op 0
Ins Op 1
Ins Op 2

Ins Op Imm Ins Op Ext


CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 53
▪ Fetch stage:
▪ Read instruction at PC
▪ Determine size
▪ Separate components
▪ Update PC

▪ Execute stage:
▪ Read internal state and/or memory
▪ Perform specified computation
▪ Update internal state and/or memory

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 54


▪ Internal registers:
▪ instruction, insOpCode, insOp0, insOp1, insOp2, insOpImm,
insOpExt, pc
▪ Read with get(), change with set(value)

▪ General purpose registers:


▪ Read with reg.get(regNumber)
▪ Change reg.set(regNumber, value)

▪ Main memory:
▪ Read with mem.readInteger(address)
▪ Change with mem.writeInteger(address, value)
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 55
▪ Java:
public class Foo {
static int a;
static int[] b = new int[10];
public void foo () {
b[a] = a; ▪ C:
} int a;
} int* b;
void foo () {
b = malloc(10*sizeof(int));
b[a] = a;
}
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 56
▪ Arrays in Java
▪ Store reference to array allocated dynamically with new statement
int[] b = new int[10];
▪ Arrays in C
▪ Can store static arrays
int bst[10];
▪ Can store pointers to other arrays
int *bptr = &bst[0]; // or int *bptr = bst;
▪ Can store pointers to arrays allocated dynamically with call to malloc
library function
int *bdyn = malloc(10 * sizeof(int));

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 57


▪ Java references are equivalent to C pointers
▪ We will use term pointers to refer to either

▪ Variable type: pointer to element type


▪ Example: pointer to int (represented as int *)
▪ Stored internally as an address, regardless of element type

▪ Any pointer can be cast to any pointer type


▪ C performs no array bounds checking
▪ Out-of-bounds access manipulates memory that is not part of array
▪ Performance is faster, but source of vulnerability

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 58


▪ Dynamic allocation with call to malloc
▪ Returns block of bytes with no associated type or
initialization
▪ Element type and dereferencing determines how the data is
interpreted
▪ Dynamic allocation will be discussed in more details in
future units

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 59


▪ Arrays are accessed in C the same way, but declared and
allocated differently

▪ Static allocation: ▪ Dynamic allocation:


int b[10]; int *b = malloc(10 * sizeof(int));

0x2000: value of b[0] 0x2000: value of b


0x2004: value of b[1]

0x2024: value of b[9]

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 60


▪ When the program runs:

▪ Static allocation: ▪ Dynamic allocation:


int b[10]; int *b = malloc(10 * sizeof(int));

0x2000: value of b[0] 0x2000: b = 0x3000


0x2004: value of b[1]
… 0x3000: value of b[0]
0x2024: value of b[9] 0x3004: value of b[1]

0x3024: value of b[9]
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 61
▪ Assembly/machine code requires an additional load
b[a] = a;
▪ Static allocation: ▪ Dynamic allocation:
r[0] ← 0x1000 r[0] ← 0x1000
r[1] ← m[r[0]] r[1] ← m[r[0]]
r[2] ← 0x2000 r[2] ← 0x2000
m[r[2]+r[1]*4] ← r[1] r[3] ← m[r[2]]
m[r[3]+r[1]*4] ← r[1]

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 62


▪ Static allocation: ▪ Dynamic allocation:
ld $a, r0 # r0 = address of a ld $a, r0 # r0 = address of a
ld (r0), r1 # r1 = a ld (r0), r1 # r1 = a
ld $b, r2 # r2 = address of b ld $b, r2 # r2 = address of b
st r1, (r2, r1, 4) # b[a] = a For (r2),
ld simulator
r3 (since
# r3it=has
b no malloc):
= &b[0]
st
… r1, (r3, r1, 4) # b[a] = a
.pos 0x1000 .pos 0x2000
a: .long 0 # the variable a .pos 0x1000
b: .long b_data # malloc result
.pos 0x2000 a:
.pos.long 0 # the variable a
0x3000
b: .long 0 # the variable b[0] .pos 0x2000
b_data: .long 0 # b[0]
.long 0 # the variable b[1] b: .long .long
??? # 0the variable b
# b[1]
... ...
.long 0 # the variable b[9] .long 0 # b[9]
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 63
How many memory load operations are required to handle
this instruction?
b[a[0]] = a[b[0]];

A. 2
B. 3
C. 4
D. 5
E. I don’t know
CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 64
▪ In Java and C:
▪ An array is a list of items of the same type
▪ Array elements are named by non-negative integers starting at 0
▪ Syntax for accessing element i of array b is b[i]

▪ In Java:
▪ Variable stores a pointer to the array

▪ In C
▪ Variable can store a pointer to the array or the array itself
▪ Distinguished by variable type

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 65


▪ Pointers are declared as:
type *varname;
▪ Pointers may be accessed as:
varname[0] or *varname
varname[i] or *(varname+i)
▪ The address of a variable can be obtained with:
type a;
type *ptr = &a;

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 66


▪ Adding an integer to a pointer:
▪ integer is multiplied by type size before adding
▪ Example: if a is a pointer of int, then (a+i) is equivalent to 4i bytes
ahead of a
▪ Works with subtraction as well

▪ Subtracting two pointers of the same type


▪ returns the number of elements of that type between the addresses
▪ equivalent to dividing address difference (in bytes) by type size
▪ Example: &a[7] - &a[2] == 5

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 67


int *c; Which element of the
original 10-element array
is modified with the
void foo () { instructions on the left?
c = malloc(10*sizeof(int));
c = &c[3]; A. Element with index 3
B. Element with index 6
*c = *&c[3];
C. Element with index 0
}
D. No element is modified
E. More than one element
is modified

CPSC 213 2020 WT2 © 2021 Jonatan Schroeder 68


ld $0x2000, r0 # r0 = address of c
ld (r0), r1 # r1 = c (malloc result)
ld $12, r2 # r2 = 3*sizeof(int)
add r1, r2 # r2 = c+3
st r2, (r0) # c = c+3

ld $3, r3 # r3 = 3
ld (r2,r3,4), r4 # r4 = c[3]
st r4, (r2) # *c = c[3]

.pos 0x2000
c: .long 0 # or some data used in simulator to emulate malloc

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 69


▪ It’s possible to declare variables as pointers to any types,
even pointers!
int **q;
▪ q is a pointer to an int pointer (int **), or a “double-pointer” to int

▪ When declaring multiple variables in a line, pointers apply


individually:
int *p, i, **q;
▪ p is a pointer to an int (int *)
▪ i is an int
▪ q is a pointer to an int pointer (int **)

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 70


▪ In C, Strings are implemented as arrays of characters
▪ Each element is a one-byte integer (type char), representing the
character as an entry in the ASCII table
▪ E.g., ‘A’ is represented as 65, ‘a’ as 97, ‘0’ as 48, ‘ ’ (space) as 32
▪ The value 0 (‘\0’) indicates the end of the string
▪ So the string “CPSC 213” is represented as:
‘C’ ‘P’ ‘S’ ‘C’ ‘’ ‘2’ ‘1’ ‘3’ ‘\0’
67 80 83 67 32 50 49 51 0
▪ Special characters:
▪ ‘\n’: line break
▪ ‘\r’: carriage return (back to start of line, but don’t move to next
line)
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 71
▪ The main function in C typically receives two arguments:
int main(int argc, char *argv[]) { …
▪ argc: the number of arguments
▪ argv: an array of strings, each representing an argument
▪ argv[0] representes the program name itself, so argc is always >=1
▪ So if you call your program:
./myprogram -x 7
▪ Your program will receive:
▪ argc = 3
▪ argv[0] = “./myprogram”, argv[1] = “-x”, argv[2] = “7”
CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 72
▪ To output text to the console, use printf
▪ First argument: format
▪ Remaining arguments replace placeholders in format, in same order
▪ %s: string
▪ %d or %i: signed integer in base 10
▪ %u: unsigned integer in base 10
▪ %x: hexadecimal integer
▪ E.g., printf("My grade in %s was %d.\n", "CPSC 213", g);
▪ To read text from the console, use scanf
▪ Similar arguments as printf
▪ Arguments must be pointers to destination variables

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 73


▪ Static scalars and arrays:
▪ Compiler knows the address (memory location) of variable
▪ Dynamic arrays:
▪ Compiler does not know address of array directly
▪ What C does that Java doesn’t
▪ Static arrays
▪ Array access using pointer dereferencing
▪ Pointer arithmetic
▪ What Java does that C doesn’t
▪ Type-safe dynamic allocation
▪ Automatic array-bounds checking

CPSC 213 2020 ST1 © 2020 Jonatan Schroeder 74

You might also like