Chapter 3.3 i8086 Programming
3 Programming i8086
• Assembly language statement source format
• The location counter
• Symbols and identifiers
• Constants
• Procedure declarations
• Segments in an assembly language program
• Variables
• Symbol types
• Address expressions (later subsections contain advanced material)
• Conditional assembly
• Macros
• Listing directives
• Separate assembly
Eng A4KT 2015 1
116-Instructions of i8086 in alphabetic order
A variable is a memory location. For a programmer it is much easier to keep a value in a
variable named "var1" than at the address 5A73:235B, especially when there are 10 or more variables.
Our compiler supports two types of variables: BYTE and WORD.
Syntax for a variable declaration:
name DB value
name DW value
DB - stands for Define Byte.
DW - stands for Define Word.
name - can be any combination of letters and digits, though it should start with a letter. It is possible to
declare an unnamed variable by omitting the name (the variable will have an address but no
name).
value - can be any numeric value in any supported numbering system (hexadecimal, binary, or
decimal), or the "?" symbol for a variable that is not initialized.
As you probably know from part 2 of this tutorial, the MOV instruction is used to copy a value from a source
to a destination.
DATA SEGMENT
VAR1 DB 7
var2 DW 1234h
ENDS
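The following is a minimal sketch of how such variables might be read with MOV. It assumes a COM-style program (ORG 100h, with code and data sharing one segment) rather than the separate DATA segment shown above; that layout is an illustrative assumption, not something stated in the text.

```asm
ORG 100h            ; assumption: COM file, code starts at offset 0100h

MOV AL, VAR1        ; copy the byte stored in VAR1 (7) into AL
MOV BX, var2        ; copy the word stored in var2 (1234h) into BX
RET                 ; return to the operating system

VAR1 DB 7           ; one byte, placed right after the code
var2 DW 1234h       ; one word, stored low byte first: 34h, then 12h
```

With the usual 8086 encodings these three instructions should occupy 8 bytes, which would place VAR1 at offset 0108h and var2 at offset 0109h.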
• As you see, this looks a lot like our example, except that variables are replaced with actual
memory locations. When the compiler generates machine code, it automatically replaces all variable
names with their offsets. By default the segment is taken from the DS register (when a COM file is
loaded, the DS register is set to the same value as the CS register, the code segment).
• In the memory list the first column is an offset, the second column is a hexadecimal value, the third
column is a decimal value, and the last column is an ASCII character value.
• The compiler is not case sensitive, so "VAR1" and "var1" refer to the same variable.
• The offset of VAR1 is 0108h, and its full address is 0B56:0108.
• The offset of var2 is 0109h, and its full address is 0B56:0109. This variable is a WORD, so it
occupies 2 BYTES. The low byte is stored at the lower address, so 34h is located
before 12h.
Arrays
Arrays can be seen as chains of variables. A text string is an example of a byte array: each
character is represented as an ASCII code value (0..255).
Here are some array definition examples:
a DB 48h, 65h, 6Ch, 6Ch, 6Fh, 00h
b DB 'Hello', 0
b is an exact copy of the a array: when the compiler sees a string inside quotes, it automatically
converts it to a set of bytes.
(Chart: the part of the memory where these arrays are declared.)
The DUP operator declares a run of repeated values, for example:
c DB 5 DUP(9)
is an alternative way of declaring:
c DB 9, 9, 9, 9, 9
One more example:
d DB 5 DUP(1, 2)
is an alternative way of declaring:
d DB 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
Of course, you can use DW instead of DB if it is required to keep values larger than 255 or
smaller than -128. DW cannot be used to declare strings.
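As a sketch of how array elements might be read, the fragment below indexes the a array with a constant and with the SI register, and reads one word from a DW array. The COM layout (ORG 100h) and the array name nums are assumptions made for illustration.

```asm
ORG 100h              ; assumption: COM file layout

MOV AL, a[3]          ; AL = 6Ch, the fourth byte (offsets start at 0)
MOV SI, 4
MOV AL, a[SI]         ; AL = 6Fh, the byte at offset 4 ('o')
MOV SI, 2
MOV BX, nums[SI]      ; BX = 1000, the second word (each word is 2 bytes)
RET

a    DB 48h, 65h, 6Ch, 6Ch, 6Fh, 00h
nums DW 3 DUP(1000)   ; hypothetical word array: 1000 does not fit in a byte
```

Note that the index is a byte offset, so consecutive elements of a word array are 2 apart.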
The LEA (Load Effective Address) instruction and the alternative OFFSET operator can both be used
to get the offset address of a variable.
LEA is more powerful because it also allows you to get the address of an indexed variable. Getting the
address of a variable can be very useful in some situations, for example when you need to pass
parameters to a procedure.
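The difference between the two can be sketched as follows. The COM layout and the names VAR1 and arr are assumptions for illustration only.

```asm
ORG 100h                ; assumption: COM file layout

MOV BX, OFFSET VAR1     ; BX = offset of VAR1, a constant fixed at assembly time
LEA BX, VAR1            ; same result, computed by the LEA instruction

MOV SI, 2
LEA DI, arr[SI]         ; DI = offset of arr + SI (an indexed address)
MOV AL, [DI]            ; AL = 30, the element at offset 2 of arr
RET

VAR1 DB 22h
arr  DB 10, 20, 30, 40  ; hypothetical byte array
```

OFFSET only works for addresses known when the program is assembled, while LEA can also add a register value at run time.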
Reminder:
In order to tell the compiler about the data type, these prefixes should be used:
BYTE PTR - for byte.
WORD PTR - for word (two bytes).
For example:
BYTE PTR [BX] ; byte access.
or
WORD PTR [BX] ; word access.
The assembler supports shorter prefixes as well:
b. - for BYTE PTR
w. - for WORD PTR
In certain cases the assembler can calculate the data type automatically.
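A short sketch of when the prefix matters: storing a constant through a register pointer gives the assembler no operand size to go on, so the prefix is needed. The COM layout and the buffer name buf are assumptions for illustration.

```asm
ORG 100h                    ; assumption: COM file layout

LEA BX, buf                 ; BX points at the first byte of buf
MOV BYTE PTR [BX], 5        ; writes one byte: 05h
MOV WORD PTR [BX+2], 5      ; writes two bytes: 05h, 00h (low byte first)
MOV AL, [BX]                ; no prefix needed: AL implies a byte access
RET

buf DB 4 DUP(0)             ; hypothetical 4-byte buffer
```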
Constants
Constants are just like variables, but they exist only until your program is compiled (assembled). After
a constant is defined, its value cannot be changed. To define a constant, the EQU directive is used:
name EQU < any expression >
For example:
k EQU 5
MOV AX, k
The above example is functionally identical to the code:
MOV AX, 5
CODE SEGMENT
LEA SI, msg1 ; ask for the number
CALL print_string ;
CALL scan_num ; get number in CX.
MOV AX, CX ; copy the number to AX.
; print the following string:
CALL pthis
DB 13, 10, 'You have entered: ', 0
CALL print_num ; print number in AX.
HLT
ENDS
Syntax for a procedure declaration:
name PROC
; the code of the procedure goes here
RET
name ENDP
name - is the procedure name. The same name should appear at the top and at the bottom; this is used to
check that procedures are closed correctly.
You probably already know that the RET instruction is used to return to the operating system. The same
instruction is used to return from a procedure (in fact, the operating system sees your program as a special
procedure).
PROC and ENDP are compiler directives, so they are not assembled into any real machine code.
The compiler just remembers the address of the procedure.
The CALL instruction is used to call a procedure.
CODE SEGMENT
CALL m1
MOV AX, 2
HLT ; halt the processor (end of program).
m1 PROC
MOV BX, 5
RET ; return to caller.
m1 ENDP
ENDS
The above example calls procedure m1, which executes MOV BX, 5 and returns to the instruction after
the CALL: MOV AX, 2.
CODE SEGMENT
MOV AL,1
MOV BL,2
CALL m2
CALL m2
CALL m2
CALL m2
HLT ; halt the processor (end of program).
m2 PROC
MUL BL ; AX = AL * BL.
RET ; return to caller.
m2 ENDP
ENDS
In the above example the value of the AL register is updated every time the procedure is called, while the
BL register stays unchanged, so this algorithm calculates 2 to the power of 4,
and the final result in the AX register is 16 (or 10h).
The stack is an area of memory for keeping temporary data. The stack is used by the CALL instruction to keep
the return address for a procedure; the RET instruction gets this value from the stack and returns to that offset.
Much the same thing happens when the INT instruction calls an interrupt: it stores the flag register, the code
segment and the offset on the stack. The IRET instruction is used to return from an interrupt call.
We can also use the stack to keep any other data. There are two instructions that work with the stack:
PUSH - stores a 16 bit value in the stack.
POP - gets a 16 bit value from the stack.
Notes:
• PUSH and POP work with 16 bit values only!
• PUSH immediate works only on the 80186 CPU and later!
The stack uses the LIFO (Last In First Out) algorithm.
This means that if we push these values one by one into the stack:
1, 2, 3, 4, 5
the first value that we get on pop will be 5, then 4, 3, 2, and only then 1.
This order can be used to exchange two values: when we
push 1212h and then 3434h, on pop we first get 3434h and only after it 1212h.
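A minimal sketch of such an exchange, using AX and BX to hold the two values. The COM layout (ORG 100h) and the choice of registers are assumptions for illustration.

```asm
ORG 100h          ; assumption: COM file layout

MOV AX, 1212h     ; AX = 1212h
MOV BX, 3434h     ; BX = 3434h
PUSH AX           ; stack (top first): 1212h
PUSH BX           ; stack (top first): 3434h, 1212h
POP AX            ; AX = 3434h, the original value of BX
POP BX            ; BX = 1212h, the original value of AX
RET               ; return to the operating system
```

The number of POPs matches the number of PUSHes, so SP is back at its starting value before RET reads the return address.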
The stack memory area is set by the SS (Stack Segment) register and the SP (Stack Pointer) register. Generally
the operating system sets the values of these registers on program start.
The "PUSH source" instruction does the following:
• Subtract 2 from the SP register.
• Write the value of source to the address SS:SP.
The "POP destination" instruction does the following:
• Write the value at the address SS:SP to destination.
• Add 2 to the SP register.
• The current address pointed to by SS:SP is called the top of the stack.
• For COM files the stack segment is generally the code segment, and the stack pointer is set to the value
0FFFEh. The address SS:0FFFEh holds a return address for the RET instruction that is executed at the
end of the program.
• You can see the stack operation by clicking the [Stack] button in the emulator window. The top of
the stack is marked with a "<" sign.
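The PUSH and POP steps listed above can be written out by hand, which may help to see what the two instructions do. This is only an illustrative sketch: it uses BP as a scratch pointer and assumes a COM layout, and unlike the real PUSH and POP it changes both BP and the flags.

```asm
ORG 100h          ; assumption: COM file layout

MOV AX, 1234h

; by-hand equivalent of PUSH AX:
SUB SP, 2         ; subtract 2 from SP
MOV BP, SP        ; [BP] addresses the stack segment (SS) by default
MOV [BP], AX      ; write AX to SS:SP

; by-hand equivalent of POP BX:
MOV BP, SP
MOV BX, [BP]      ; read the word at SS:SP into BX (BX = 1234h)
ADD SP, 2         ; add 2 to SP
RET               ; SP is back at its starting value
```

In real programs PUSH and POP are preferred: each is a single instruction and neither disturbs the flags.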