Csc 213 Lecture Notes 3

This document provides an overview of assembly language programming, detailing its low-level nature, advantages, and the structure of assembly programs. It explains the fetch-decode-execute cycle, memory addressing, and the environment needed for assembly language development, particularly focusing on Intel 32 processors and tools like NASM and MARS. Additionally, it covers basic syntax, types of assembly statements, and provides examples of a 'Hello World!' program in assembly language.

FEDERAL UNIVERSITY OF

PETROLEUM RESOURCES
EFFURUN

CSC 213

FOUNDATIONS OF SEQUENTIAL
PROGRAMMING

LECTURE NOTES 3:
ASSEMBLY LANGUAGE PROGRAMMING

Dr. A. A. Ojugo
Dr. D. A. Oyemade

Assembly Language
Assembly language is a low-level programming language for a computer or
other programmable device. It is specific to a particular computer
architecture, in contrast to most high-level programming languages, which
are generally portable across multiple systems. Assembly language is
converted into executable machine code by a utility program referred to as
an assembler, such as NASM or MASM for Intel x86 processors, or MARS for
MIPS processors.

Each personal computer has a microprocessor that manages the
computer's arithmetical, logical and control activities. Each family of
processors has its own set of instructions for handling various operations
like getting input from the keyboard, displaying information on the screen
and performing various other jobs. This set of instructions is called
'machine language instructions'. The processor understands only machine
language instructions, which are strings of 1s and 0s. However, machine
language is too obscure and complex to use in software development.
So the low-level assembly language is designed for a specific family of
processors, and represents the various instructions in symbolic code and a
more understandable form.

• Advantages
Learning assembly language gives an understanding of:
i. How programs interface with the OS, processor and BIOS;
ii. How data is represented in memory and other external devices;
iii. How the processor accesses and executes instructions;
iv. How instructions access and process data;
v. How a program accesses external devices.
Other advantages of assembly language are:
vi. It requires less memory and execution time;
vii. It allows hardware-specific complex jobs in an easier way;
viii. It is suitable for time-critical jobs.
• A good understanding of the binary and hexadecimal number systems is
required.

Addressing data in memory
• The FDX (fetch-decode-execute) cycle helps the microprocessor to control instruction
execution in 3 steps: (a) fetch the instruction from memory, (b) decode the instruction,
and (c) execute the instruction. The processor can access one or more memory bytes at
a time. Consider the hex number 0725H, which requires 2 bytes of memory to store. The
high-order byte (most significant byte) is 07 and the low-order byte is 25. The processor
stores data in reverse-byte sequence (i.e. the low-order byte is stored in the low memory
address and the high-order byte in the high memory address). Thus, the processor stores 25
first, at the lower memory address, and 07 at the next memory address. When the processor
brings the numeric data from memory into a register, it reverses the bytes again (as in fig
below). There are two kinds of memory addresses: (a) an absolute address - a direct
reference to a specific location, and (b) a segment address (or offset) - the starting
address of a memory segment together with the offset value.
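The reverse-byte (little-endian) storage of 0725H described above can be checked with a short sketch. Python is used here purely for illustration, and the helper name `store_word_little_endian` is hypothetical; the sketch relies only on the standard `struct` module.

```python
import struct

def store_word_little_endian(value):
    """Return the two bytes of a 16-bit value in the order an x86 processor stores them."""
    return struct.pack("<H", value)   # "<" = little-endian, "H" = unsigned 16-bit

stored = store_word_little_endian(0x0725)
low_address_byte, high_address_byte = stored[0], stored[1]
print(f"{low_address_byte:02X} {high_address_byte:02X}")   # byte at the lower address first

# Reading the word back into a "register" reverses the bytes again.
(restored,) = struct.unpack("<H", stored)
print(f"{restored:04X}")
```

The first line printed shows the byte order in memory (low-order byte first); the second shows the value as the register sees it.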
• Assembly Language Environment: Assembly language depends on the
instruction set and the architecture of the processor. In this tutorial, we focus on Intel
IA-32 processors like the Pentium. To follow this tutorial, you will need: (i) a PC, (ii) a
Windows or equivalent OS (operating system), and (iii) a copy of the NASM assembler
program. There are many good assembler programs, such as Microsoft Assembler
(MASM), Borland Turbo Assembler (TASM), and GNU assembler (GAS). We adopt
NASM (alongside MIPS) because it is: (a) free, and can be downloaded from various
web sources, (b) well documented, with lots of information available on the net, and
(c) usable on both Linux and Windows.

• Specifically, we shall use MARS 4.5. MARS (MIPS Assembler and Runtime
Simulator) is a development tool that provides MIPS programmers with an
intuitive environment for creating and testing software programs.
• It can be downloaded free from:
https://www.softpedia.com/get/Programming/Coding-languages-Compilers/Vollmar-MARS.shtml
The Assembly Language Basic Syntax
An assembly language program is divided into 3 sections, namely:
• The data Section
It is used for declaring initialized data or constants. This data does not
change at runtime. You can declare various constant values, file names,
buffer sizes etc. here. Its syntax is:
section .data
• The bss Section
The bss section is used for declaring variables. The syntax for declaring
the bss section is:
section .bss
• The text Section
This section keeps the actual code. It must begin with the
declaration global main, which tells the kernel where the program
execution begins. The syntax for declaring the text section is:
section .text
global main
main:
• Comments
A comment begins with a semicolon (;) and may contain any printable character,
including blanks. It can appear on a line by itself, like:
; This program displays a message on screen
Or, it can appear on the same line along with an instruction.
Assembly Language Statements

Assembly programs consist of 3 types of statements: (a) executable
instructions (or simply instructions), (b) assembler directives (or
pseudo-ops), and (c) macros.
a. Executable instructions tell the processor what to do.
Each instruction consists of an operation code (opcode). Each
executable instruction generates one machine language
instruction.
b. Assembler directives (or pseudo-ops) tell the assembler
about the various aspects of the assembly process. These are
non-executable and do not generate machine language
instructions.
c. Macros are basically a text substitution mechanism.
• Syntax Structure of Assembly Language Statements
Assembly language statements are entered one statement per line.
Each statement follows the format:
[label] mnemonic [operands] [;comment]
The fields in the square brackets are optional. A basic instruction
has two parts: the first is the name of the instruction (or the
mnemonic), and the second is the operands.
Machine Programming Via Assembly Lang.
• The Hello World!
First, you will need a hex editor. You can get a good, free program from:
www.hhdsoftware.com/free-hex-editor. You also need an ASCII table (see:
www.asciitable.com). For example, the letter 'H' is represented by the hex
value 48 (column 4, row 8 in the table below). Our programs will use the
old-fashioned but simple “.com” format for executable files. This format
contains nothing but code and data. The executable file is loaded into
memory, beginning at memory address 0100. The explanations of registers
and instructions will come soon.

ASCII table (column = first hex digit, row = second hex digit):
     0    1    2      3  4  5  6  7
0    NUL  DLE  space  0  @  P  `  p
1    SOH  DC1  !      1  A  Q  a  q
2    STX  DC2  "      2  B  R  b  r
3    ETX  DC3  #      3  C  S  c  s
4    EOT  DC4  $      4  D  T  d  t
5    ENQ  NAK  %      5  E  U  e  u
6    ACK  SYN  &      6  F  V  f  v
7    BEL  ETB  '      7  G  W  g  w
8    BS   CAN  (      8  H  X  h  x
9    HT   EM   )      9  I  Y  i  y
A    LF   SUB  *      :  J  Z  j  z
B    VT   ESC  +      ;  K  [  k  {
C    FF   FS   ,      <  L  \  l  |

The program performs the following steps:
i. Load the location of the string into the CPU register “dx”.
ii. Load the number of the DOS service that prints a string to the
console into register “ah”. This is service number 9, “print string”.
iii. Interrupt: causes DOS to do something, in this case, to execute the
“print string” command.
iv. Load the exit command 4c, with error code 00, into register “ax”.
v. Interrupt: exits the program and returns to the command line.
• The Hello World! …(Contd.)
The steps are shown both in assembly language and machine code. The memory
addresses are on the left. The first memory address shown is 0100 (where the
program will be loaded into memory). However, you will enter this code in the
hex-editor beginning at address 0000 in the file.
Address  Assembly Lang.                     Machine Language                                  Comment
0100     mov dx, 010ch                      ba 0c 01                                          Location of string in dx
0103     mov ah, 09                         b4 09                                             DOS command 09 into ah
0105     int 21h                            cd 21                                             Interrupt
0107     mov ax, 4c00h                      b8 00 4c                                          Exit, error code 0
010A     int 21h                            cd 21                                             Interrupt
010C     db 'Hello, World!', 0dh, 0ah, '$'  48 65 6c 6c 6f 2c 20 57 6f 72 6c 64 21 0d 0a 24   The string
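As a cross-check on the listing above, the following sketch (in Python, purely for illustration) assembles the same bytes by hand and computes where the string lands in memory. The variable names are hypothetical; the byte values are the ones given in the listing.

```python
LOAD_ADDRESS = 0x0100   # DOS loads a .com file at offset 0100h

code = bytes([
    0xBA, 0x0C, 0x01,   # mov dx, 010ch  (address bytes reversed: 0c 01)
    0xB4, 0x09,         # mov ah, 09
    0xCD, 0x21,         # int 21h
    0xB8, 0x00, 0x4C,   # mov ax, 4c00h  (bytes reversed: 00 4c)
    0xCD, 0x21,         # int 21h
])
message = b"Hello, World!\r\n$"   # text, return, line-feed, '$' terminator
image = code + message

# The string starts immediately after the code, so its address is the
# load address plus the length of the code.
string_address = LOAD_ADDRESS + len(code)
print(f"{string_address:04x}")
print(image.hex(" "))
```

Writing `image` to a file opened in binary mode would produce the same bytes that the notes enter with the hex editor.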

i. Look at the second and third bytes in the program, “0c 01”. These represent the
address of the string we want to print, which is 010c. For historical reasons,
addresses are entered with the bytes reversed, so that the least-significant
byte is first. Hence, “0c 01” represents the memory address 010c. The
same thing occurs when we load the register ax with the DOS command “4c”
and the error code 00: the bytes are reversed in the actual machine code.
ii. The string “Hello, World!” ends with a new-line. In Windows/DOS a new-line
always consists of two characters: a “return” (0d) followed by a “line-feed” (0a).
Finally, the DOS service to print text looks for a '$' to tell it when to stop printing.
Once entered into the hex-editor, you should see something like this:
Offset    00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
00000000  ba 0c 01 b4 09 cd 21 b8 00 4c cd 21 48 65 6c 6c
00000010  6f 2c 20 57 6f 72 6c 64 21 0d 0a 24
i. Save the file as “helloworld.com”.
ii. When you run the program in a command-line window, it should print
“Hello, World!”. That's your first program!
• Another Version of Hello World! – Using the NASM Assembler (we adopt MIPS
& NASM)
section .text
global main ;must be declared for linker (ld)
main: ;tells linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel

section .data
msg db 'Hello, world!', 0xa ;our dear string
len equ $ - msg ;length of our dear string

• Compiling and Linking an Assembly Program in NASM
Make sure you have set the path of the nasm and ld binaries in your PATH environment variable.
Now do this:
i. Type the above code using a text editor and save it as hello.asm.
ii. From the same directory where you saved hello.asm, assemble the program by typing:
nasm -f elf hello.asm
iii. If there is any error, you will be prompted about it at this stage. Otherwise, an object
file named hello.o will be created.
Assembly Lang. Memory Segment
The 3-sections of an assembly program represent various memory segments. So,
if we replace the section keyword with segment, you get same result. Try the
following code:
segment .text ;code segment
global main ;must be declared for linker
main: ;tell linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
segment .data ;data segment
msg db 'Hello, world!',0xa ;our dear string
len equ $ - msg ;length of our dear string
• Memory Segments
A segmented memory model divides the system memory into groups of
independent segments, referenced by pointers located in the segment registers.
Each segment is used to contain a specific type of data. We can specify various
memory segments as:
• Data segment: represented by the .data section. It is used to declare the
memory region where data elements are stored for the program.
Assembly Registers
• Processor operations mostly involve processing data, which can be
stored in and accessed from memory. But reading data from and storing data into
memory slows down the processor, as it involves the complicated process of
sending the data request across the control bus and into the memory storage
unit, and getting the data back through the same channel. To speed up processor
operations, the processor includes some internal memory storage locations
called registers, which store data elements for processing without having to
access the memory. A limited number of registers are built into the processor
chip.
• There are ten 32-bit and six 16-bit processor registers in the IA-32 architecture.
They are grouped into: (a) general registers (further grouped into data, pointer and
index registers), (b) control registers, and (c) segment registers.
• Data Reg.: Four 32-bit data registers are used for arithmetic, logical and other
operations. They are used in 3 ways: (i) as complete 32-bit registers: EAX, EBX, ECX
and EDX, (ii) the lower halves of the 32-bit registers can be used as four 16-bit data
registers: AX, BX, CX and DX, and (iii) the lower and higher halves of the four 16-bit
registers can be used as eight 8-bit data registers: AH, AL, BH, BL, CH, CL, DH and DL
(as in fig below).
Some of these data registers have specific uses in arithmetical operations.
1. AX is the primary accumulator (for I/O and most arithmetic instructions). For MUL, one
operand is stored in the EAX, AX or AL register, based on the size of the operand.
2. BX is the base register, used in indexed addressing.
3. CX is the count register. The ECX and CX registers store the loop count in iterative
operations.
4. DX is the data register, used for I/O operations. It is also used together with AX in
MUL/DIV operations involving large values.
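The relationship between a 32-bit data register and its 16-bit and 8-bit parts can be sketched with bit masks. This is an illustration in Python rather than assembly, and the function name `sub_registers` is hypothetical.

```python
def sub_registers(eax):
    """Split a 32-bit EAX value into its AX, AH and AL views."""
    eax &= 0xFFFFFFFF        # keep only 32 bits
    ax = eax & 0xFFFF        # AX is the lower 16 bits of EAX
    ah = (ax >> 8) & 0xFF    # AH is the upper 8 bits of AX
    al = ax & 0xFF           # AL is the lower 8 bits of AX
    return ax, ah, al

ax, ah, al = sub_registers(0x12345678)
print(f"AX={ax:04X} AH={ah:02X} AL={al:02X}")
```

The same masks apply to EBX/BX/BH/BL, ECX/CX/CH/CL and EDX/DX/DH/DL.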

• Pointer Reg: The pointer registers are the 32-bit EIP, ESP and EBP registers and their
corresponding 16-bit right portions IP, SP and BP. They are categorized into:
1. Instr. Pointer (IP) - the 16-bit IP register stores the offset address of the next
instruction to be executed. IP used with the CS register (as CS:IP) gives the complete
address of the current instruction in the code segment.
2. Stack Pointer (SP) - the 16-bit SP register provides the offset value within the program
stack. SP used with the SS register (SS:SP) refers to the current position of data or
address within the program stack.
3. Base Pointer (BP) - the 16-bit BP register helps in referencing the parameter variables
passed to a subroutine. The address in the SS register is combined with the offset in BP to
get the location of the parameter. BP can also be combined with DI and SI as a base
register for special addressing.
• Index Register
The 32-bit index registers ESI and EDI, and their 16-bit right portions SI and DI, are used
for indexed addressing and sometimes for addition and subtraction. They include:
1. Source Index (SI) - it is used as the source index for string operations.
2. Destination Index (DI) - it is used as the destination index for string operations.
• Control Registers – the 32-bit instruction pointer register and the 32-bit flags register
combined are considered the control registers. Many instructions involve comparisons and
mathematical calculations, which change the status of the flags, and some other conditional
instructions test the value of these status flags to take the control flow to another
location. The common flag bits are:
1. Overflow Flag indicates the overflow of a high-order bit (leftmost bit) of data after a signed
arithmetic operation.
2. Direction Flag determines left or right direction for moving or comparing string data. When
the DF value is 0, the string operation takes left-to-right direction and when the value is set
to 1, the string operation takes right-to-left direction.
3. Interrupt Flag determines whether the external interrupts like, keyboard entry etc. are to be
ignored or processed. It disables the external interrupt when the value is 0 and enables
interrupts when set to 1.
4. Trap Flag allows setting the operation of the processor in single-step mode. The DEBUG
program we used sets the trap flag, so we could step through the execution one instruction
at a time.
5. Sign Flag shows the sign of the result of an arithmetic operation. The sign is indicated
by the high-order (leftmost) bit. A positive result clears the value of SF to 0 and a
negative result sets it to 1.
6. Zero Flag shows result of an arithmetic or comparison task. A nonzero result sets flag to 0. A
zero result sets it to 1.
7. Auxiliary Carry Flag (AF): contains the carry from bit 3 to bit 4 following an arithmetic
operation; used for specialized arithmetic. The AF is set when a 1-byte arithmetic operation
causes a carry from bit 3 into bit 4.
8. Parity Flag (PF): indicates whether the number of 1-bits in the low-order byte of the
result of an arithmetic operation is even or odd. An even number of 1-bits sets the parity
flag to 1 and an odd number of 1-bits clears the parity flag to 0.
9. Carry Flag (CF): contains the carry, 0 or 1, from the high-order (leftmost) bit after an
arithmetic operation. It also stores the contents of the last bit of a shift or rotate
operation.
The positions of the flag bits in the 16-bit Flags register are:
Bit no: 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Flag:                O  D  I  T  S  Z     A     P     C

• Segment Reg:
Segments are specific areas defined in a program for containing data, code and
stack. There are three main segments:
i. Code Segment: it contains all the instructions to be executed. A 16-bit Code
Segment register or CS register stores the starting address of the code segment.
ii. Data Segment: it contains data, constants and work areas. A 16-bit Data
Segment register or DS register stores the starting address of the data segment.
iii. Stack Segment: it contains data and return addresses of procedures or
subroutines. It is implemented as a 'stack' data structure. The Stack Segment
register or SS register stores the starting address of the stack.

Apart from the DS, CS and SS registers, there are other extra segment registers - ES
(extra segment), FS and GS, which provide additional segments for storing data. In
assembly programming, a program needs to access memory locations. All
memory locations within a segment are relative to the starting address of the
segment. A segment begins at an address evenly divisible by 16, or hexadecimal 10.
So the rightmost hex digit in all such memory addresses is 0, which is not
generally stored in the segment registers. The segment registers store the starting
addresses of segments. To get the exact location of data or an instruction within a
segment, an offset value (or displacement) is required. To reference any memory
location in a segment, the processor combines the segment address in the segment
register with the offset value of the location.
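The way a segment address and an offset combine can be sketched as follows. This assumes the classic 16-bit real-mode scheme with 20-bit physical addresses, where the dropped rightmost hex digit is restored by shifting the segment value left by 4 bits; the function name `physical_address` is hypothetical and Python is used only for illustration.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment value and a 16-bit offset (real-mode assumption)."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    # Shifting left by 4 bits multiplies by 16, restoring the trailing hex 0
    # that is not stored in the segment register.
    return ((segment << 4) + offset) & 0xFFFFF   # wrap to 20 bits

print(f"{physical_address(0x1234, 0x0010):05X}")
```

Here the segment 1234 starts at address 12340, and adding the offset 0010 gives the location of the data.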
Look at the following simple program to understand the use of registers in assembly
programming. This program displays 9 stars on the screen along with a simple
message:
section .text
global main ;must be declared for linker (gcc)
main: ;tell linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov edx,9 ;message length
mov ecx,s2 ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db 'Displaying 9 stars',0xa ;a message
len equ $ - msg ;length of message
s2 times 9 db '*'

When the above code is assembled and executed, it produces the following result:
Displaying 9 stars
*********
Addressing Modes & Memory
• Most assembly instructions require operands to be processed. An
operand address provides the location where the data to be processed is
stored. Some instructions do not require an operand, whereas other
instructions may require 1, 2 or 3 operands. If an instruction requires 2
operands, the first operand is the destination, which contains data in a
register or memory location, while the second operand is the source.
The source contains either the data to be delivered (immediate addressing
mode) or the address (register or memory) of the data. Generally, the source
data remains unaltered after an operation. The various addressing modes
are explained below:

• Register Mode: Here, a register contains the operand. Depending
upon the instruction, the register may be the first operand, the second
operand or both. E.g.
MOV DX, TAX_RATE ; Register in first operand
MOV COUNT, CX ; Register in second operand
MOV EAX, EBX ; Both the operands are in registers
As processing data between registers does not involve memory, it
provides the fastest processing of data.

• Immediate Mode: Here, an immediate operand has a constant value
or an expression.
• Direct Mode: When operands are specified in memory addressing mode,
direct access to main memory (i.e. to the data segment) is required. This
mode results in slower processing. To locate the exact location of data in
memory, we need the segment start address found in DS and an offset value
(or effective address – EAR). The EAR value is specified directly as part of
the instruction by the variable name. The assembler calculates the offset
value (EAR) and maintains a symbol table, which stores the offset values of
all the variables used in the program. In direct memory addressing, one of
the operands refers to a memory location and the other operand references
a register. For example,
ADD BYTE_VALUE, DL ; Adds the register to the memory location
MOV BX, WORD_VALUE ; Operand from the memory is copied to the register
• MOV moves data from one location to another. It requires two
operands with the syntax: MOV destination, source and has any
of the 5 forms:
MOV register, register
MOV register, immediate
MOV memory, immediate
MOV register, memory
MOV memory, register

Note: (a) both operands must be of the same size, and (b) the source
operand value remains unchanged after the operation. MOV can be
ambiguous at times. E.g., in the statements below it is unclear whether
the user wants to move a byte- or word-equivalent of the number 110.
Thus, it is wise to use a type specifier.
MOV EBX, MY_TABLE ; Effective Address of MY_TABLE in EBX
MOV [EBX], 110 ; MY_TABLE[0] = 110

Common type specifiers include:
Type Specifier   Bytes Addressed
Byte             1
Word             2
Double-Word      4
Quad-Word        8
Ten-Byte         10
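The reason a type specifier matters can be illustrated by looking at how many bytes of memory the value 110 would occupy under the first three specifiers. This is a sketch in Python using the standard `struct` module; the variable names are hypothetical.

```python
import struct

value = 110
as_byte = struct.pack("<B", value)    # Byte: 1 byte written
as_word = struct.pack("<H", value)    # Word: 2 bytes written
as_dword = struct.pack("<I", value)   # Double-Word: 4 bytes written

for name, data in (("Byte", as_byte), ("Word", as_word), ("Double-Word", as_dword)):
    print(name, len(data), data.hex(" "))
```

Without a specifier, the assembler cannot tell which of these stores was intended, since a memory operand such as [EBX] carries no size of its own.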
• EXAMPLE:
The following program illustrates some of the concepts discussed
above. It stores a name 'Zara Ali' in the data section of the
memory, then changes its value to another name 'Nuha Ali'
programmatically and displays both the names.
section .text
global main ;must be declared for linker (ld)
main: ;tell linker entry point
;writing the name 'Zara Ali'
mov edx,9 ;message length
mov ecx,name ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov [name], dword 'Nuha' ;changed the name to Nuha Ali (a dword holds 4 characters)
;writing the name 'Nuha Ali'
mov edx,8 ;message length
mov ecx,name ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
name db 'Zara Ali ' ;the name described above, 9 bytes with the trailing space
Assemblers: Supplementary Notes
• Getting Started: The MIPS CPU uses 32-bit words since it is a 32-bit machine, and it is
big-endian. You can use xxd to inspect MIPS files. MIPS has 32 registers (numbered 0 to 31).
At the end of our MIPS programs, we will copy the contents of register $31 to the program
counter (PC) to “return”.
• Running MIPS Programs: Upon logging in, run source ~cs241/setup in order to add the
required executables to your PATH. Then, when given a MIPS executable called eg0.mips,
you can run java mips.twoints eg0.mips in order to run the program. mips.twoints is a
Java program that requests values for registers $1 and $2 and then runs the given MIPS
program. There are other MIPS runner programs, such as mips.array, which populate the
31 registers in different ways.
• Creating MIPS Programs: Start with vi thing.asm (or use your favorite editor). Inside this
file, you will create an assembly language file, which is a textual representation of the
binary file you want to create. Each line in this file should be in the form .word 0xabcdef12
(that is, each line should start with .word 0x; the 0x is a convention that indicates that hex
follows). You can add comments onto the end of lines, starting with a semi-colon (Scheme
style). Next, you will need to convert your assembly language file into a binary file. You can
do that by running java cs241.wordasm < thing.asm > thing.bin. You can then inspect
thing.bin with xxd in hex, or in binary if you're masochistic.
A few important things you should know for developing MIPS programs:
(a) $0 is a register that will always contain 0. It's special like that.
(b) $30 points to memory that could be used as a stack.
(c) $31 will be copied to the program counter at the end of execution in order to “return”.
(d) You can specify register values using base 10 values or as hex values (if prefixed by 0x).
(e) It takes 5 bits to specify a register in binary MIPS instructions, since 2^5 = 32.
(f) It is conventional to call S and T the source registers, and D the destination register.
(g) MIPS uses two's complement numbers by default, unless specified otherwise.
(h) Loops and conditionals are accomplished by adding to or subtracting from the PC.
MIPS Program Workflow
• A Few Important MIPS Instructions
1. Load Immediate & Skip (lis): loads the next word of memory into the D register. You
specify a lis instruction followed by an arbitrary word. The PC will then skip past
the arbitrary word that follows, to avoid executing it.
2. Set Less Than [Unsigned] (slt): compares S to T. If S < T, 1 is put into the D register,
otherwise 0 is put into the D register.
3. Jump Register (jr): copies the value in the source register S to the program counter.
4. Jump and Link Register (jalr): saves the program counter in register 31, then jumps to
the address in the source register S.
5. Branch on Equal (beq): if S is equal to T, it adds the specified number (times 4) to the
program counter. There is also Branch on Unequal (bne), which does the opposite.

• MIPS Program Workflow: The MIPS CPU understands binary machine language programs;
however, we cannot write them directly. Instead, we write assembly language programs
in text files. By convention, we name these text files with the extension .asm. Assembly
language contains instructions like .word 0x00221820. We feed the assembly language
program into cs241.wordasm, which is an assembler. An assembler translates assembly
language into binary machine code that can be executed. Assembly language can also
look like this: add $3, $1, $2. Assembly language in this form has to be fed into a
different assembler (cs241.binasm) that understands that flavor of assembly syntax.
There is a MIPS reference manual available on the course website. It might be useful in
situations such as: (a) when you want to be an assembler yourself, since you will need to
look up the mapping between assembly instructions like add $3, $1, $2 and their binary
equivalents, (b) when you need to know what is valid assembly code that an assembler
will accept, and (c) when you want to write your own assembler.
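The mapping between add $3, $1, $2 and .word 0x00221820 mentioned above can be reproduced by hand. The sketch below, in Python for illustration, packs the fields of a standard MIPS R-type instruction (opcode 0, function code 0x20 for add); the function name `encode_add` is hypothetical.

```python
def encode_add(d, s, t):
    """Encode the MIPS instruction `add $d, $s, $t` as a 32-bit word."""
    for reg in (d, s, t):
        if not 0 <= reg <= 31:
            raise ValueError("register numbers must be in 0..31")
    opcode, funct = 0, 0x20   # R-type opcode and the function code for add
    # Field layout: opcode(6) | s(5) | t(5) | d(5) | shamt(5) = 0 | funct(6)
    return (opcode << 26) | (s << 21) | (t << 16) | (d << 11) | funct

word = encode_add(3, 1, 2)
print(f".word 0x{word:08x}")
```

Each register takes a 5-bit field, which is why register numbers are limited to the range 0 to 31.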
Format of MIPS Assembly
• MIPS assembly code is placed into a text file with this general format:
labels instruction comment
i. Labels are identifiers followed by a colon. E.g. fred:, wilma: and x123: are
valid labels.
ii. Instructions are in the form add $3, $1, $2. Consult the MIPS reference sheet for MIPS
syntax.
iii. Comments are placed at the end of lines and must be prefixed by a semicolon.
Lines with only comments (still prefixed with a semicolon) are acceptable as well.
For example:
; hello world.
• Note: There is a 1-to-1 correspondence between instructions in assembly and
instructions in machine code. The same MIPS instructions will always produce the
same machine code. Here is a more comprehensive overview of the instructions
available to you in the CSC213 dialect of MIPS. Note that for all of these
instructions, 0 ≤ d, s, t ≤ 31, since there are 32 registers in MIPS numbered from
0 to 31.
• SAMPLE INSTRUCTIONS
• .word is not really a MIPS instruction in and of itself. It provides a way to include
arbitrary bits in the assembler's output. Words can be in several different forms,
such as:
i. .word 0x12345678 (hex)
ii. .word 123 (decimal)
iii. .word -1 (negative decimals, which are represented in two's complement)
• lis $d (load immediate and skip). Copies the word at the address in the program counter
(the next word) into $d, then adds 4 to the PC in order to skip the word just loaded.
• lw $t, i($s) (load word, -32,768 ≤ i ≤ 32,767). E.g. lw $3, 100($5) will get the
contents of $5, add 100, treat the result as an address, fetch a word from RAM at
that address, and put the result into $3.
• sw $t, i($s) (store word, -32,768 ≤ i ≤ 32,767). Works in a similar way to lw,
except that it stores the contents of $t in RAM at this address.
• slt $d, $s, $t (set less than). Sets $d to 1 if $s < $t, or to 0 otherwise.
• sltu $d, $s, $t (set less than unsigned). Sets $d to 1 if $s < $t, or to 0 otherwise.
Interprets the numbers as unsigned numbers.
• beq $s, $t, i (branch if equal, -32,768 ≤ i ≤ 32,767). Adds 4i to the PC if $s is
equal to $t. Note that 4 is still added (in addition to the 4i for this specific
command) as you move to the next instruction, as with all instructions.
• bne $s, $t, i (branch if not equal, -32,768 ≤ i ≤ 32,767). Works the same way as beq,
except that it branches if $s is not equal to $t.
• jr $s (jump register). Copies $s to the program counter.
• jalr $s (jump and link register). Copies $s to the program counter and copies the
previous value of the program counter to $31.

Example: a program that sums the numbers from 1 to n (n is the contents of $1) and
stores the result in $3.
; $1 is N.
; $3 is the sum.
; $2 is temporary.

add $3, $0, $0 ; zero accumulator

; beginning of loop
add $3, $3, $1 ; add $1 to $3
lis $2 ; decrement $1 by 1
.word -1
add $1, $1, $2
bne $1, $0, -5 ; n = 0? If not, branch to beginning of loop

jr $31 ; return

If we input 10 for $1, we get the sum of the numbers from 1 to 10, i.e. 55. The actual
result is 0x00000037. Note that 37 in hexadecimal equals 55 in decimal, so the program
works as expected. The end result is $1 being 0x00000000 (0 in decimal), $2 being
0xffffffff (-1 in decimal), and $3 being 0x00000037 (55 in decimal). By convention, we
reserve $1 and $2 as registers for input parameters and $3 as the register for the
result. MIPS itself does not treat these registers in a special way.
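
As a cross-check of the register values quoted above, the loop can be mirrored in Python. This is a sketch, not an emulator: the dictionary reg and the constant MASK are illustrative names, registers are modelled as unsigned 32-bit words, and n is assumed to be at least 1 (as in the assembly version, n = 0 would wrap around).

```python
MASK = 0xFFFFFFFF  # registers hold 32-bit words


def sum_to_n(n: int) -> dict:
    """Mirror the summing loop; return final register contents as unsigned words."""
    reg = {1: n & MASK, 2: 0, 3: 0}
    reg[3] = 0                              # add $3, $0, $0
    while True:
        reg[3] = (reg[3] + reg[1]) & MASK   # add $3, $3, $1
        reg[2] = -1 & MASK                  # lis $2 / .word -1
        reg[1] = (reg[1] + reg[2]) & MASK   # add $1, $1, $2
        if reg[1] == 0:                     # bne $1, $0, -5 falls through
            break
    return reg


regs = sum_to_n(10)
print({f"${r}": f"0x{v:08x}" for r, v in regs.items()})
# prints {'$1': '0x00000000', '$2': '0xffffffff', '$3': '0x00000037'}
```

Masking with 0xFFFFFFFF is what makes -1 appear as 0xffffffff, the two's complement form shown in the result.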
Accessing RAM in MIPS
 RAM vs. Registers: Key differences between RAM and registers: (a) there is lots of
RAM available, but only a small, fixed number of registers; (b) you can compute
addresses in RAM, but registers have fixed names that cannot be computed (e.g. we
can compute memory address 0x00000008 = 0x00000004 + 0x00000004, but we cannot
compute $2); and (c) we can create large, rich data structures in RAM, whereas
registers provide small, fixed, fast storage.

 Storing in RAM
The code snippet below stores the contents of $1 at memory address 100000 and then loads that word back into $3:
lis $5
.word 100000
sw $1, 0($5)
lw $3, 0($5)
jr $31
 The example above uses memory address 100000. But (a) how do we know we have
that much RAM, and (b) how do we know it is not already in use by someone else or
by another process? Using an arbitrary memory address without any safety checks
is clearly bad practice. Instead, we reserve some memory ourselves by adding a
.word after the last jr instruction. Memory is allocated for that word, but it is
never executed.
 MIPS requires that we actually specify a value for that space in memory. Its
contents do not matter, so we use .word 28234 (an arbitrary value) and replace
100000 with 20, the address of the reserved word (it follows five 4-byte words at
addresses 0 to 16). For now, we can assume that our MIPS program will always
run in memory starting at memory address 0, so memory addresses and
locations in our code can be treated as being the same. But hard-coding 20 is…
Stacks
 Stack: $30 is the conventional register to hold the stack pointer (or $sp). SP points to the
first address of RAM reserved for use by other people. A program to store and fetch data using the stack:
sw $1, -4($30)
lw $3, -4($30)
jr $31
 All memory with an address less than the value of $30 can be used by your program. We can
use this method to create 100,000+ storage locations, which would otherwise require 100,000
or more registers. It avoids hard-coding $1, $2, …, $100000. You can change the value of SP,
but you must restore it to its original value before returning (before jr $31).
Another program, which sums the numbers from 1 to n without modifying anything except
$3, is shown below. It is okay to modify $1 and $2, so long as they are restored to their
original values before returning.
sw $1, -4($30) ; save on stack
sw $2, -8($30) ; save on stack

lis $2
.word 8
sub $30, $30, $2 ; push two words

add $3, $0, $0

; beginning of loop
foo: add $3, $3, $1
lis $2
.word -1
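add $1, $1, $2
bne $1, $0, foo

lis $2
.word 8
add $30, $30, $2 ; pop two words
lw $1, -4($30) ; restore $1
lw $2, -8($30) ; restore $2
jr $31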
The mips.array is a MIPS runner that passes an array A of size N into your MIPS
program. The address of A will be in $1, and the size of A (which is N) will be in $2.
To access array elements, you would execute instructions such as these:
lw $3, 0($1)
sw $4, 4($1)

The address of each successive array element increases by 4. We can also compute the
address from an index. In C/C++, we may have an expression x = A[i]. Suppose A is in $1
and i is in $3. How can we fetch A[i] into x (i.e. into $7)?
1. Multiply i by 4.
2. Add to A.
3. Fetch RAM at the resulting address.

add $3, $3, $3
add $3, $3, $3 ; these two lines give i * 4
add $3, $3, $1 ; A + i * 4
lw $7, 0($3)
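
The same address computation can be traced in Python. Everything here is an assumption made for illustration: the 64-byte ram buffer, the base address 16, the sample values, and the big-endian word layout are not taken from the lecture.

```python
import struct

ram = bytearray(64)        # hypothetical tiny RAM
A = 16                     # hypothetical base address of the array
values = [7, 11, 13, 17]   # sample array contents
for k, v in enumerate(values):
    struct.pack_into(">i", ram, A + 4 * k, v)   # one 32-bit word per element


def load_word(addr: int) -> int:
    """Model of lw: fetch the 32-bit word stored at a byte address."""
    return struct.unpack_from(">i", ram, addr)[0]


def fetch_element(base: int, i: int) -> int:
    offset = i + i             # add $3, $3, $3  -> 2 * i
    offset = offset + offset   # add $3, $3, $3  -> 4 * i
    addr = offset + base       # add $3, $3, $1  -> A + 4 * i
    return load_word(addr)     # lw $7, 0($3)


print(fetch_element(A, 2))     # prints 13
```

With i = 2 the computed address is 16 + 8 = 24, which is where the third element was stored.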

The first two lines each double the value in $3, so together they multiply i by 4.
The next program sums the integers in an array A of length N: $1 holds the
address of A, $2 holds N, and $3 will hold the output (the sum). $4 is used
temporarily.
add $3, $0, $0
loop:
lw $5, 0($1) ; fetch A[i]
add $3, $3, $5 ; add A[i] to sum
Recursion
We need to save any local values held in registers, such as parameters and the
return address, onto the stack and restore them when we are done. We do not
want recursive calls to change these values, so a subroutine must preserve
them. It is always good to save your registers. We build gcd.asm
(where $1 = a, $2 = b, and $3 holds the result) as follows:
gcd:
sw $31, -4($30) ; save return address
sw $1, -8($30) ; and parameters
sw $2, -12($30)
lis $4
.word 12
sub $30, $30, $4
add $3, $2, $0 ; tentatively, result = b
beq $1, $0, done ; quit if a = 0
div $2, $1 ; stores quotient in LO, remainder in HI
add $2, $1, $0 ; copy a to $2
mfhi $1 ; $1 <- b % a
lis $4
.word gcd
jalr $4
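done:
lis $4
.word 12
add $30, $30, $4 ; pop three words
lw $31, -4($30) ; restore return address
lw $1, -8($30) ; and parameters
lw $2, -12($30)
jr $31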
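
The save-and-restore discipline can also be modelled in Python. The sketch below is illustrative only: regs and stack are hypothetical stand-ins for the register file and the memory below $30, and the return address in $31 is handled implicitly by Python's own call stack.

```python
def gcd(regs: dict, stack: list) -> None:
    """Leave gcd($1, $2) in regs[3] while preserving regs[1] and regs[2]."""
    stack.append((regs[1], regs[2]))     # sw $1 / sw $2: save the parameters
    regs[3] = regs[2]                    # tentatively, result = b
    if regs[1] != 0:                     # beq $1, $0, done
        a, b = regs[1], regs[2]
        regs[1], regs[2] = b % a, a      # div / mfhi: new a = b % a, new b = a
        gcd(regs, stack)                 # jalr $4: recursive call
    regs[1], regs[2] = stack.pop()       # restore the saved parameters


regs = {1: 12, 2: 18, 3: 0}
stack = []
gcd(regs, stack)
print(regs[3], regs[1], regs[2], len(stack))   # prints 6 12 18 0
```

After the call, $3 holds the result while $1 and $2 are back to their original values and the stack is empty, which is the property the subroutine is meant to guarantee.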
Input & Output
getchar and putchar look like RAM locations, but they transfer data from the user's
keyboard and to the user's monitor. getchar is located at memory address 0xffff0004
and putchar is at address 0xffff000c. If you load from the getchar address or store to
the putchar address, you retrieve a byte from standard input (STDIN) or send a byte to
standard output (STDOUT). We will create an example program, cat.asm, to copy input to output:
lis $1
.word 0xffff0004 ; address of getchar()
lis $3
.word -1 ; EOF signal
loop:
lw $2, 0($1) ; $2 = getchar()
beq $2, $3, quit ; if $2 == EOF, then quit
sw $2, 8($1) ; putchar() since getchar() and putchar() are 8 apart
beq $0, $0, loop
quit: jr $31
