
Chapter 3.3 i8086 Programming

Chapter 3.3 covers programming in i8086 assembly language, detailing the structure of assembly statements, variable usage, and array definitions. It introduces key concepts such as segments, constants, macros, and procedures, along with examples demonstrating their application. The chapter also includes a library of common functions to simplify programming tasks.


CHAPTER 3.3
Programming i8086
• Assembly language statement source format
• The location counter
• Symbols and identifiers
• Constants
• Procedure declarations
• Segments in an assembly language program
• Variables
• Symbol types
• Address expressions (later subsections contain advanced material)
• Conditional assembly
• Macros
• Listing directives
• Separate assembly
Eng A4KT 2015 1
The 116 instructions of the i8086 in alphabetical order (read down each column)

AAA     DIV     JMP     JZ      OUT     SAR
AAD     HLT     JNA     LAHF    POP     SBB
AAM     IDIV    JNAE    LDS     POPA    SCASB
AAS     IMUL    JNB     LEA     POPF    SCASW
ADC     IN      JNBE    LES     PUSH    SHL
ADD     INC     JNC     LODSB   PUSHA   SHR
AND     INT     JNE     LODSW   PUSHF   STC
CALL    INTO    JNG     LOOP    RCL     STD
CBW     IRET    JNGE    LOOPE   RCR     STI
CLC     JA      JNL     LOOPNE  REP     STOSB
CLD     JAE     JNLE    LOOPNZ  REPE    STOSW
CLI     JB      JNO     LOOPZ   REPNE   SUB
CMC     JBE     JNP     MOV     REPNZ   TEST
CMP     JC      JNS     MOVSB   REPZ    XCHG
CMPSB   JCXZ    JNZ     MOVSW   RET     XLATB
CMPSW   JE      JO      MUL     RETF    XOR
CWD     JG      JP      NEG     ROL
DAA     JGE     JPE     NOP     ROR
DAS     JL      JPO     NOT     SAHF
DEC     JLE     JS      OR      SAL
The program structure: assembly language statement source format
Assembly language statements in a source file use the following format:
;Assembler directives here
Label: instruction ;Comment
;end of code here
• The instruction has its own format, which was discussed in the earlier chapter.
• The comment is separated from the instruction by a semicolon (;).
• The label is separated from the instruction field by a colon (:).
• A label can be any symbol that identifies a program line or a specific location in a program.
• A program may use multiple labels.
• No two labels may have the same name.
• Labels are sometimes called address symbols.
• The program is divided into sections, called segments, that the program uses:
• The data segment is the section where you store data.
• The code segment is the section where you write the program code.
• The stack segment is used when a stack is required.
• The extra segment is used when working with strings.
• Assembler directives are special instructions that provide information to the assembler but do not
generate any code. Examples include the segment directive, equ, assume, and end. A pseudo-opcode is
a message to the assembler, just like an assembler directive; however, a pseudo-opcode emits object
code bytes. Examples of pseudo-opcodes include byte, word, dword, qword and tbyte. These
instructions emit the bytes of data specified by their operands, but they are not true 80x86 machine
instructions.
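The pieces described above can be put together in a minimal sketch. It assumes MASM-style SEGMENT/ENDS syntax as accepted by emu8086 and a DOS-style exit through INT 21h; the names DATA, CODE, START and count are illustrative and are not taken from the slides.

```asm
; assembler directives: define the segments
DATA SEGMENT
count   DB 3               ; a byte variable in the data segment
DATA ENDS

CODE SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA       ; label, instruction, comment
        MOV DS, AX         ; point DS at the data segment
        MOV AL, count      ; AL = 3
        MOV AX, 4C00h      ; DOS function 4Ch: terminate the program
        INT 21h
CODE ENDS
END START                  ; end of source; execution begins at START
```

The SEGMENT, ENDS, ASSUME and END lines are directives and produce no machine code, DB is a pseudo-opcode that emits one data byte, and only the MOV and INT lines become 8086 instructions.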
How to use variables

A variable is a memory location. For a programmer it is much easier to keep a value in a
variable named "var1" than at the address 5A73:235B, especially when there are 10 or more variables.
The assembler supports two types of variables: BYTE and WORD.
Syntax for a variable declaration:
name DB value
name DW value
DB - stands for Define Byte.
DW - stands for Define Word.
name - can be any combination of letters and digits, though it should start with a letter. It is possible to
declare an unnamed variable by omitting the name (the variable will have an address but no
name).
value - can be any numeric value in any supported numbering system (hexadecimal, binary, or
decimal), or the "?" symbol for a variable that is not initialized.
As you probably know from part 2 of this tutorial, the MOV instruction copies a value from a source
to a destination.

Let's see another example with the MOV instruction:
CODE SEGMENT
MOV AL, var1 ; copy the byte variable into AL.
MOV BX, var2 ; copy the word variable into BX.
HLT ; stops the program.
ENDS

DATA SEGMENT
VAR1 DB 7
var2 DW 1234h
ENDS

• The assembled program looks a lot like the source, except that the variable names are replaced with
actual memory locations. When the assembler generates machine code, it automatically replaces all
variable names with their offsets. By default the segment is taken from the DS register (when a COM file is
loaded, the DS register is set to the same value as the CS register, the code segment).
• In the emulator's memory list the first column is an offset, the second is a hexadecimal value, the third is a
decimal value, and the last is an ASCII character value.
• The assembler is not case sensitive, so "VAR1" and "var1" refer to the same variable.
• The offset of VAR1 is 0108h, and its full address is 0B56:0108.
• The offset of var2 is 0109h, and its full address is 0B56:0109. This variable is a WORD, so it
occupies 2 BYTES. The low byte is stored at the lower address, so 34h is located
before 12h.
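The byte order can be checked with a short sketch. It assumes the emu8086 COM-file form (ORG 100h, ending with RET), which the slides do not show explicitly; var2 is the word variable from the example above.

```asm
ORG 100h
        LEA BX, var2       ; BX = offset of var2
        MOV AL, [BX]       ; AL = 34h, the low byte at the lower address
        MOV AH, [BX+1]     ; AH = 12h, the high byte at the next address
        RET                ; AX = 1234h again
var2    DW 1234h
```

Reading the two bytes separately and placing them in AL and AH rebuilds the original word, which is what MOV AX, var2 does in a single step.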
Arrays
An array can be seen as a chain of variables. A text string is an example of a byte array in which each
character is stored as an ASCII code value (0..255).
Here are some array definition examples:
a DB 48h, 65h, 6Ch, 6Ch, 6Fh, 00h
b DB 'Hello', 0

b is an exact copy of the a array: when the assembler sees a string inside quotes, it automatically
converts it to a sequence of bytes. In memory, each array occupies six consecutive bytes:
48h, 65h, 6Ch, 6Ch, 6Fh, 00h.

You can access any element of an array using square brackets, for example:
MOV AL, a[3]
You can also use any of the memory index registers BX, SI, DI, BP, for example:
MOV SI, 3
MOV AL, a[SI]
If you need to declare a large array, you can use the DUP operator.

The syntax for DUP:
number DUP (value(s))
number - the number of duplicates to make (any constant value).
value - the expression that DUP will duplicate.

For example:
c DB 5 DUP(9)
is an alternative way of declaring:
c DB 9,9,9,9,9
One more example:
d DB 5 DUP(1,2)
is an alternative way of declaring:
d DB 1,2,1,2,1,2,1,2,1,2
Of course, you can use DW instead of DB if you need to keep values larger than 255 or
smaller than -128. DW cannot be used to declare strings.
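As an illustration of indexed access combined with DUP, the sketch below adds up the five elements of the array c declared above. It assumes the emu8086 COM-file form (ORG 100h, ending with RET); the label add_next is made up for this example.

```asm
ORG 100h
        MOV CX, 5          ; number of elements to add
        MOV SI, 0          ; index of the first element
        MOV AL, 0          ; running sum
add_next:
        ADD AL, c[SI]      ; add element c[SI] to the sum
        INC SI             ; move to the next byte
        LOOP add_next      ; decrement CX and repeat until it reaches 0
        RET                ; AL = 45 (2Dh): five elements of value 9
c       DB 5 DUP(9)
```

Because c is a byte array, SI advances by 1 per element; for a DW array the index would have to advance by 2.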

Getting the Address of a Variable

There are two ways to get the offset address of a variable: the LEA (Load Effective Address) instruction
and the OFFSET operator.
LEA is more powerful because it can also get the address of an indexed variable. Getting the
address of a variable can be very useful in some situations, for example when you need to pass
parameters to a procedure.

Reminder:
To tell the assembler the data type, use these prefixes:
BYTE PTR - for a byte.
WORD PTR - for a word (two bytes).
For example:
BYTE PTR [BX] ; byte access.
or
WORD PTR [BX] ; word access.
The assembler supports shorter prefixes as well:
b. - for BYTE PTR
w. - for WORD PTR
In certain cases the assembler can work out the data type automatically.
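To make the difference between the two prefixes concrete, here is a small sketch in the emu8086 COM-file form (ORG 100h is an assumption; the slides use SEGMENT blocks). The variable name pair is made up for illustration.

```asm
ORG 100h
        LEA BX, pair                 ; BX = offset of the first byte of pair
        MOV BYTE PTR [BX], 0FFh      ; byte access: only pair[0] changes
        MOV WORD PTR [BX+2], 0FFFFh  ; word access: pair[2] and pair[3] change
        RET                          ; pair now holds FFh, 00h, FFh, FFh
pair    DB 0, 0, 0, 0
```

Without a prefix the assembler could not tell how many bytes to write, because neither [BX] nor an immediate value has a size of its own.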

Here is the first example:
CODE SEGMENT
MOV AL, VAR1 ; check value of VAR1 by moving it to AL.
LEA BX, VAR1 ; put address of VAR1 in BX.
MOV BYTE PTR [BX], 44h ; modify the contents of VAR1.
MOV AL, VAR1 ; check value of VAR1 by moving it to AL.
HLT
ENDS
DATA SEGMENT
VAR1 DB 22h
ENDS
Here is another example that uses OFFSET instead of LEA:
CODE SEGMENT
MOV AL, VAR1 ; check value of VAR1 by moving it to AL.
MOV BX, OFFSET VAR1 ; put address of VAR1 in BX.
MOV BYTE PTR [BX], 44h ; modify the contents of VAR1.
MOV AL, VAR1 ; check value of VAR1 by moving it to AL.
HLT
ENDS
DATA SEGMENT
VAR1 DB 22h
ENDS

Both examples have the same functionality.
These lines:
LEA BX, VAR1
MOV BX, OFFSET VAR1
are even assembled into the same machine code: MOV BX, num
where num is the 16-bit value of the variable's offset.
Please note that only these registers can be used inside square brackets (as memory pointers): BX, SI,
DI, BP
(see the previous part of the tutorial).
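The case where LEA is needed is an indexed operand, whose address is only known at run time. A minimal sketch, assuming the emu8086 COM-file form (ORG 100h, ending with RET) and reusing a 'Hello' string like the one in the Arrays section:

```asm
ORG 100h
        MOV SI, 3
        LEA BX, a[SI]           ; BX = offset of a plus the run-time value of SI
        MOV BYTE PTR [BX], 'p'  ; overwrite a[3]: 'Hello' becomes 'Helpo'
        RET
a       DB 'Hello', 0
```

OFFSET is evaluated by the assembler, so it can only produce the fixed offset of a; adding the current value of SI requires LEA (or a separate ADD).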

Constants

Constants are just like variables, but they exist only until your program is compiled (assembled). Once a
constant is defined, its value cannot be changed. To define a constant, use the EQU directive:
name EQU < any expression >
For example:
k EQU 5
MOV AX, k
The above example is functionally identical to:
MOV AX, 5
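A typical use of a constant is to keep a size in one place. The sketch below assumes the emu8086 COM-file form (ORG 100h, ending with RET); the names max_len and buffer are hypothetical.

```asm
ORG 100h
max_len EQU 20             ; constant: exists only while assembling
        MOV CX, max_len    ; assembled exactly like MOV CX, 20
        RET
buffer  DB max_len DUP(0)  ; 20 zero bytes; edit max_len to resize
```

No memory is reserved for max_len itself: the assembler substitutes the value wherever the name appears.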

Library of common functions: emu8086.inc
• To make programming easier, some common functions can be included in your
program.
• To use functions defined in another file, use the INCLUDE directive
followed by the file name.
• The assembler first searches for the file in the folder where the source file is located,
and if it cannot find the file there, it searches the Inc folder.
• You may not yet fully understand the contents of emu8086.inc (located in the Inc
folder), but that is fine, since you only need to understand what it can do.
• To use any of the functions in emu8086.inc, put the following line at the beginning of
your source file:
include 'emu8086.inc'

Macros: emu8086.inc defines the following macros:
• PRINT string - macro with 1 parameter, prints out a string.
• PRINTN string - macro with 1 parameter, prints out a string. The same as PRINT, but it
automatically adds a "carriage return" at the end of the string.
• PUTC char - macro with 1 parameter, prints out an ASCII char at the current cursor position.
• GOTOXY col, row - macro with 2 parameters, sets the cursor position.
• CURSOROFF - turns off the text cursor.
• CURSORON - turns on the text cursor.
To use any of the above macros, simply type its name somewhere in your code, followed by its parameters
if it requires any, for example:
include 'emu8086.inc'
CODE SEGMENT
START:
PRINT 'Hello World!'
GOTOXY 10, 5
PUTC 65 ; 65 - is an ASCII code for 'A'
PUTC 'B'
HLT ; halt the processor (end of program).
ENDS ; end of the code segment.
END START ; directive to stop the assembler.
When the assembler processes your source code, it searches the emu8086.inc file for the declarations of the macros
and replaces the macro names with real code. Macros are generally small pieces of code; frequent
use of a macro may make your executable too big (procedures are better for size optimization).
emu8086.inc also defines the following procedures:
• PRINT_NUM - procedure that prints a signed number in the AX register. To use it declare:
DEFINE_PRINT_NUM and DEFINE_PRINT_NUM_UNS before the END directive.
• PRINT_NUM_UNS - procedure that prints an unsigned number in the AX register. To use it declare:
DEFINE_PRINT_NUM_UNS before the END directive.
• GET_STRING - procedure that gets a null terminated string from the user. The received string is written to
the buffer at DS:DI; the buffer size should be in DX. The procedure stops the input when 'Enter' is pressed. To use
it declare: DEFINE_GET_STRING before the END directive.
• PRINT_STRING - procedure that prints a null terminated string at the current cursor position. It receives
the address of the string in DS:SI. To use it declare: DEFINE_PRINT_STRING before the END directive.
To use any of the above procedures, first declare it at the bottom of your file
(but before the END directive), and then use the CALL instruction followed by the procedure name. For
example:
; demonstrate get_string and print_string
include 'emu8086.inc'
DATA SEGMENT
msg1 DB "Enter your name: ", 0
newln DB 13, 10, "Hello$ " ; no 0 terminator: printing continues into buffer
buffer DB 20 DUP (0) ; input buffer for get_string
bufSize = $-buffer ; calculates size of buffer
ENDS
CODE SEGMENT
LEA SI, msg1 ; set up pointer (SI) to the prompt message
CALL print_string ; print message that SI points to
LEA DI, buffer ; set up pointer (DI) to input buffer
MOV DX, bufSize ; set size of buffer
CALL get_string ; get name & put in buffer
LEA SI, newln ; point at CR/LF / Hello message
CALL print_string ; print message that SI points to
HLT
; procedure definitions go at the bottom, before END
DEFINE_GET_STRING
DEFINE_PRINT_STRING
ENDS
• CLEAR_SCREEN - procedure that clears the screen (by scrolling the entire screen window) and sets the
cursor position to the top of it. To use it declare: DEFINE_CLEAR_SCREEN before the END directive.
• PTHIS - procedure that prints a null terminated string at the current cursor position (just as
PRINT_STRING does). The zero terminated string should be defined just after the CALL. For
example:
CALL PTHIS
db 'Hello World!', 0
The address of the string is stored on the stack as the return address. The procedure updates that value on
the stack so that it returns to the code after the string definition. To use it declare: DEFINE_PTHIS before the END directive.
• SCAN_NUM - procedure that gets a multi-digit SIGNED number from the keyboard and stores
the result in the CX register. To use it declare: DEFINE_SCAN_NUM before the END directive.

; demonstrate scan_num, print_num, pthis
include 'emu8086.inc'
DATA SEGMENT
msg1 DB 'Enter the number: ', 0
ENDS

CODE SEGMENT
LEA SI, msg1 ; ask for the number
CALL print_string ;
CALL scan_num ; get number in CX.
MOV AX, CX ; copy the number to AX.
; print the following string:
CALL pthis
DB 13, 10, 'You have entered: ', 0
CALL print_num ; print number in AX.
HLT
; macros to define procs
DEFINE_SCAN_NUM
DEFINE_PRINT_STRING
DEFINE_PRINT_NUM
DEFINE_PRINT_NUM_UNS ; required for print_num.
DEFINE_PTHIS
ENDS

• First, the assembler processes the declarations (these are just regular macros that are expanded into
procedures).
• When the assembler reaches a CALL instruction, it replaces the procedure name with the address of the code
where the procedure is declared.
• When the CALL instruction is executed, control is transferred to the procedure.
• This is quite useful: even if you call the same procedure 100 times in your code, the executable
stays relatively small.
• It seems complicated, doesn't it? That's OK; with time you will learn more. For now you only need to
understand the basic principle.

Procedures
A procedure is a part of the code that can be called from your program to perform a specific task.
Procedures make a program more structured and easier to understand. Generally a procedure returns to
the point from which it was called.
The syntax for a procedure declaration:
name PROC
; here goes the code
; of the procedure ...
RET
name ENDP

name - is the procedure name. The same name should appear at the top and at the bottom; this is used to
check that procedures are closed correctly.
You probably already know that the RET instruction is used to return to the operating system. The same
instruction is used to return from a procedure (the operating system actually sees your program as a special
procedure).
PROC and ENDP are assembler directives, so they are not assembled into any real machine code.
The assembler just remembers the address of the procedure.
The CALL instruction is used to call a procedure.

Here is an example:

CODE SEGMENT
CALL m1
MOV AX, 2
HLT ; halt the processor (end of program).

m1 PROC
MOV BX, 5
RET ; return to caller.
m1 ENDP
ENDS

The above example calls procedure m1, which executes MOV BX, 5 and returns to the instruction that follows
the CALL: MOV AX, 2.

There are several ways to pass parameters to a procedure; the easiest is to use
registers. Here is an example of a procedure that receives two parameters in the AL and BL
registers, multiplies them and returns the result in the AX register:

CODE SEGMENT
MOV AL,1
MOV BL,2
CALL m2
CALL m2
CALL m2
CALL m2
HLT ; halt the processor (end of program).
m2 PROC
MUL BL ; AX = AL * BL.
RET ; return to caller.
m2 ENDP
ENDS
In the above example the value of the AL register is updated every time the procedure is called, while the BL
register stays unchanged, so this algorithm calculates 2 to the power of 4 (AL goes 2, 4, 8, 16).
The final result in the AX register is 16 (or 10h).

Here is another example, which uses a procedure to print a Hello World! message:
DATA SEGMENT
msg DB 'Hello World!', 0 ; null terminated string.
ENDS
CODE SEGMENT
LEA SI, msg ; load address of msg to SI.
CALL print_me
HLT ; halt the processor (end of program).
; this procedure prints a string, the string should be null
; terminated (have zero in the end),
; the string address should be in SI register:
print_me PROC
next_char:
CMP b.[SI], 0 ; check for zero to stop
JE stop ;
MOV AL, [SI] ; next get ASCII char.
MOV AH, 0Eh ; teletype function number.
INT 10h ; using interrupt to print a char in AL.
ADD SI, 1 ; advance index of string array.
JMP next_char ; go back, and type another char.
stop:
RET ; return to caller.
print_me ENDP
ENDS
The "b." prefix before [SI] means that we need to compare bytes, not words. When you need to compare
words, use the "w." prefix instead. When one of the compared operands is a register, the prefix is not
required because the assembler knows the size of each register.
The Stack

The stack is an area of memory for keeping temporary data. The CALL instruction uses the stack to keep the return
address of a procedure; the RET instruction gets this value from the stack and returns to that offset. Much the
same happens when an INT instruction calls an interrupt: it stores the flag register, the code segment
and the offset on the stack. The IRET instruction is used to return from an interrupt call.

We can also use the stack to keep any other data. Two instructions work with the stack:
PUSH - stores a 16-bit value on the stack.
POP - gets a 16-bit value from the stack.

Syntax for the PUSH instruction:
PUSH REG
PUSH SREG
PUSH memory
PUSH immediate
REG: AX, BX, CX, DX, DI, SI, BP, SP.
SREG: DS, ES, SS, CS.
memory: [BX], [BX+SI+7], 16 bit variable, etc...
immediate: 5, -24, 3Fh, 10001101b, etc...

Syntax for the POP instruction:
POP REG
POP SREG
POP memory
REG: AX, BX, CX, DX, DI, SI, BP, SP.
SREG: DS, ES, SS (not CS).
memory: [BX], [BX+SI+7], 16 bit variable, etc...

Notes:
PUSH and POP work with 16-bit values only.
PUSH immediate works only on the 80186 CPU and later.
The stack uses a LIFO (Last In First Out) algorithm.
This means that if we push these values one by one onto the stack:
1, 2, 3, 4, 5
the first value that we get on pop will be 5, then 4, 3, 2, and only then 1.
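The LIFO order can be observed with a short sketch. It assumes the emu8086 COM-file form (ORG 100h, ending with RET) and loads each value through AX, because PUSH immediate is not available on the 8086 itself.

```asm
ORG 100h
        MOV AX, 1
        PUSH AX            ; stack (top first): 1
        MOV AX, 2
        PUSH AX            ; stack: 2, 1
        MOV AX, 3
        PUSH AX            ; stack: 3, 2, 1
        POP BX             ; BX = 3, the last value pushed
        POP CX             ; CX = 2
        POP DX             ; DX = 1, the first value pushed
        RET                ; three pushes, three pops: stack is balanced
```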

• It is very important to do an equal number of PUSHes and POPs; otherwise the stack may be corrupted
and it will be impossible to return to the operating system. As you already know, we use the RET instruction to
return to the operating system, so when the program starts there is a return address on the stack (generally
0000h).
• PUSH and POP are especially useful because we do not have many registers to work
with, so here is a trick:
1. Store the original value of the register on the stack (using PUSH).
2. Use the register for any purpose.
3. Restore the original value of the register from the stack (using POP).
Here is an example:
CODE SEGMENT
MOV AX, 1234h
PUSH AX ; store value of AX in stack.
MOV AX, 5678h ; modify the AX value.
POP AX ; restore the original value of AX.
HLT
ENDS
Another use of the stack is to exchange two values. Here is an example:
CODE SEGMENT
MOV AX, 1212h ; store 1212h in AX.
MOV BX, 3434h ; store 3434h in BX
PUSH AX ; store value of AX in stack.
PUSH BX ; store value of BX in stack.
POP AX ; set AX to original value of BX.
POP BX ; set BX to original value of AX.
HLT
ENDS

The exchange happens because the stack uses a LIFO (Last In First Out) algorithm: when we
push 1212h and then 3434h, on pop we first get 3434h and only after it 1212h.
The stack memory area is set by the SS (Stack Segment) register and the SP (Stack Pointer) register. Generally
the operating system sets the values of these registers when the program starts.
The "PUSH source" instruction does the following:
• Subtracts 2 from the SP register.
• Writes the value of source to the address SS:SP.
The "POP destination" instruction does the following:
• Writes the value at the address SS:SP to destination.
• Adds 2 to the SP register.
Notes:
• The current address pointed to by SS:SP is called the top of the stack.
• For COM files the stack segment is generally the code segment, and the stack pointer is set to
0FFFEh. The address SS:0FFFEh holds the return address for the RET instruction that is executed at
the end of the program.
• You can watch the stack operation by clicking the [Stack] button in the emulator window. The top of
the stack is marked with a "<" sign.
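The two steps of PUSH and POP can be traced with a short sketch, again assuming the emu8086 COM-file form (ORG 100h, ending with RET). BP is used to read the top of the stack because BP-based addressing uses the SS segment by default.

```asm
ORG 100h
        MOV BX, SP         ; BX = SP before the push
        MOV AX, 1234h
        PUSH AX            ; SP = SP - 2, then 1234h is written to SS:SP
        MOV DX, BX
        SUB DX, SP         ; DX = 2, the distance SP has moved
        MOV BP, SP         ; BP now points at the top of the stack
        MOV CX, [BP]       ; CX = 1234h, read without popping
        POP AX             ; value read from SS:SP, then SP = SP + 2
        RET                ; SP is back at its starting value
```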

Macros
Macros look like procedures, but they exist only until your
code is assembled; after assembly all macros are replaced with real instructions. If you declare a
macro and never use it in your code, the assembler simply ignores it. emu8086.inc is a good example of
how macros can be used: the file contains several macros that make coding easier.
Macro definition:
name MACRO [parameters,...]
<instructions>
ENDM
Unlike procedures, macros should be defined above the code that uses them, for example:

MyMacro MACRO p1, p2, p3
MOV AX, p1
MOV BX, p2
MOV CX, p3
ENDM

CODE SEGMENT
MyMacro 1, 2, 3
MyMacro 4, 5, DX
HLT
ENDS

The above code is expanded into:
MOV AX, 00001h
MOV BX, 00002h
MOV CX, 00003h
MOV AX, 00004h
MOV BX, 00005h
MOV CX, DX


Some important facts about macros and procedures:
• To use a procedure, use the CALL instruction, for example:
CALL MyProc
• To use a macro, just type its name, for example:
MyMacro
• A procedure is located at a specific address in memory. If you use the same procedure 100
times, the CPU transfers control to that part of memory each time, and the RET instruction returns control to
the program. The stack is used to keep the return address. A CALL instruction takes
about 3 bytes, so the output executable grows very little, no matter how many
times the procedure is used.
• A macro is expanded directly in the program's code. If you use the same macro 100 times, the assembler
expands it 100 times, and the output executable grows each time because all
instructions of the macro are inserted again.
• Use the stack or any general purpose registers to pass parameters to a procedure.
• To pass parameters to a macro, just type them after the macro name, for example:
MyMacro 1, 2, 3
• To mark the end of a macro, the ENDM directive is enough.
• To mark the end of a procedure, type the name of the procedure before the ENDP
directive.
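The slides demonstrate only register parameters. As a hedged sketch of the stack alternative mentioned in the list above, the procedure below reads its parameter through BP. It assumes the emu8086 COM-file form (ORG 100h, ending with RET); the name double_it is hypothetical.

```asm
ORG 100h
        MOV AX, 5
        PUSH AX            ; pass the parameter on the stack
        CALL double_it     ; pushes the return address, then jumps
        RET                ; AX = 10 here; return to the operating system

double_it PROC
        PUSH BP            ; save the caller's BP
        MOV BP, SP         ; [BP] = saved BP, [BP+2] = return address
        MOV AX, [BP+4]     ; [BP+4] = the parameter pushed by the caller
        ADD AX, AX         ; AX = parameter * 2
        POP BP             ; restore BP
        RET 2              ; return and remove the 2-byte parameter
double_it ENDP
```

RET 2 discards the parameter after returning, so the number of bytes pushed and popped stays equal and the final RET still finds the program's own return address on top of the stack.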

Macros are expanded directly in the code, so if there are labels inside a macro definition
you may get a "Duplicate declaration" error when the macro is used twice or more. To avoid this
problem, use the LOCAL directive followed by the names of the variables, labels or procedures. For
example:
MyMacro2 MACRO
LOCAL label1, label2
CMP AX, 2
JE label1
CMP AX, 3
JE label2
label1:
INC AX
label2:
ADD AX, 2
ENDM
CODE SEGMENT
MyMacro2
MyMacro2
HLT
ENDS
If you plan to use your macros in several programs, it may be a good idea to place all of them in a
separate file. Place that file in the Inc folder and use the INCLUDE file-name directive to use the macros.