
Assembly Language

This document provides an overview of x86 assembly language and architecture. It discusses the history of Intel processors from the 4004 in 1971 to the Pentium class in 1993. It also covers basic assembly concepts like registers, instructions, addressing modes, the stack, procedures, and the instruction set of various x86 processors. The document aims to introduce assembly language fundamentals in a brief and high-level manner.

Uploaded by

Alistar Andreea

Assembly Language

Intel and AMD 32-bit Architecture (x86)

Things I don't intend to cover

Yeah... sorry, folks, I don't have a lot of time.
Privileged instructions
Standalone source files and PWB
Vector instructions (MMX, SSE[2], 3DNow!)
Instruction encodings
How to write code for processors prior to 386

A Brief History of VLSI


4004 (1971), 8008 (1972)
8086 (1978), 8088 (1979)
80186/88 (1982)
80286 (1982), 80386 (1985)
80486 (1989)
Pentium class/586 (1993)

The Daily Register


32 bits     16 bits     8 bits
EAX         AX          AH | AL
EBX         BX          BH | BL
ECX         CX          CH | CL
EDX         DX          DH | DL
ESI         SI
EDI         DI
ESP         SP
EBP         BP

Segment registers (16 bits): SS, CS, DS, ES, FS, GS

EIP         IP
EFLAGS      FLAGS

Moving On

mov <dest>, <src>
mov eax, dwMyVar
mov eax, 65h
mov eax, 0FFFFFFFFh
mov eax, [ebx]
mov eax, [eax+4]
mov dwMyVar, esi

The Meaning of Brackets


On a variable, brackets have no effect (in MASM syntax)
    mov eax, [dwMyVar]
On a register, brackets dereference a pointer
    mov eax, [eax]
A displacement can be indicated in two ways
    mov eax, [eax+8]
    mov eax, [eax]8
There are more things that can be done with brackets, which I'll illustrate when we get to the instruction LEA (Load Effective Address)

Arithmetic

add eax, ebx     eax += ebx;
sub eax, ebx     eax -= ebx;
mul edx          eax *= edx;   (unsigned; the full 64-bit product lands in edx:eax)
imul edx         (signed version)
inc eax          eax++;
dec eax          eax--;
adc, sbb, neg
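
The one-operand multiplies are easy to misread as a plain eax *= edx, so here is a minimal C sketch of what they leave behind. The names mul_result, model_mul, and model_imul are made up for this illustration; they are not part of any assembler or library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair standing in for edx:eax after a multiply. */
typedef struct {
    uint32_t eax; /* low 32 bits of the product  */
    uint32_t edx; /* high 32 bits of the product */
} mul_result;

/* Reinterpret a 32-bit register value as a signed number (two's complement). */
static int64_t as_signed(uint32_t v) {
    return (v & 0x80000000u) ? (int64_t)v - 4294967296LL : (int64_t)v;
}

/* Models "mul src": unsigned eax * src, 64-bit result in edx:eax. */
mul_result model_mul(uint32_t eax, uint32_t src) {
    uint64_t product = (uint64_t)eax * src;
    mul_result r = { (uint32_t)product, (uint32_t)(product >> 32) };
    return r;
}

/* Models one-operand "imul src": signed eax * src, 64-bit result in edx:eax. */
mul_result model_imul(uint32_t eax, uint32_t src) {
    uint64_t product = (uint64_t)(as_signed(eax) * as_signed(src));
    mul_result r = { (uint32_t)product, (uint32_t)(product >> 32) };
    return r;
}

/* Same bit pattern in eax, different high halves depending on signedness. */
void check_mul_examples(void) {
    assert(model_mul(0xFFFFFFFFu, 2u).edx == 1u);            /* 4294967295 * 2 */
    assert(model_imul(0xFFFFFFFFu, 2u).edx == 0xFFFFFFFFu);  /* -1 * 2 */
}
```

The low half in eax is identical for both; only edx tells the signed and unsigned products apart.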

A House Divided

[i]div <divisor>

Dividend    Divisor    Quotient    Remainder
AX          8 bits     AL          AH
DX:AX       16 bits    AX          DX
EDX:EAX     32 bits    EAX         EDX
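
A rough C model of the 32-bit row of that table, for unsigned div only. The struct and function names (div_result, model_div32) and the fault field are assumptions made for the sketch; a real CPU raises a divide-error exception instead of returning a flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result of a 32-bit "div". */
typedef struct {
    uint32_t eax;   /* quotient  */
    uint32_t edx;   /* remainder */
    int      fault; /* nonzero where the CPU would raise a divide error */
} div_result;

/* Models "div src" with a 32-bit divisor: edx:eax / src. */
div_result model_div32(uint32_t edx, uint32_t eax, uint32_t src) {
    div_result r = { eax, edx, 0 };
    uint64_t dividend = ((uint64_t)edx << 32) | eax;

    /* Divide by zero, or a quotient too big for eax, both fault. */
    if (src == 0 || dividend / src > UINT32_MAX) {
        r.fault = 1;
        return r;
    }
    r.eax = (uint32_t)(dividend / src);
    r.edx = (uint32_t)(dividend % src);
    return r;
}
```

This is why edx has to hold something sensible before a div: it is the top half of the dividend, so stale contents there change the answer or trigger the fault.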

A Li'l Bit of Bit Manipulation

and eax, ebx       eax&=ebx;
or  eax, 3         eax|=3;
xor ecx, 69h       ecx^=0x69;
not ebx            ebx=~ebx;

or  ah, ah
jz  lbl_AHIsZero

Shifting Things Around

shl/sal eax, 8     eax<<=8;
shr eax, 6         eax>>=6;
sar ecx, 7         replicate sign bit
rol esi, 11        esi=(esi>>21)|(esi<<11)
ror esi, 21        esi=(esi>>21)|(esi<<11)
rcl, rcr           rotate through CF
shl eax, cl        eax<<=cl;
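
To make the rol/ror pairing and the shr/sar difference concrete, here is a small C sketch. The function names are invented for this example, and the count is masked to 5 bits on the assumption of a 32-bit operand on a 286 or later.

```c
#include <assert.h>
#include <stdint.h>

/* rol: rotate left; the count is taken modulo 32. */
uint32_t model_rol(uint32_t x, unsigned n) {
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* ror: rotate right; the count is taken modulo 32. */
uint32_t model_ror(uint32_t x, unsigned n) {
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* shr: logical shift right, vacated bits filled with zero. */
uint32_t model_shr(uint32_t x, unsigned n) {
    return x >> (n & 31);
}

/* sar: arithmetic shift right, vacated bits filled with the sign bit.
   Built from unsigned operations so the result does not depend on how
   the C compiler treats >> on negative values. */
uint32_t model_sar(uint32_t x, unsigned n) {
    uint32_t shifted;
    n &= 31;
    shifted = x >> n;
    if (n != 0 && (x & 0x80000000u)) {
        shifted |= ~(0xFFFFFFFFu >> n); /* set the top n bits */
    }
    return shifted;
}

/* rol by 11 and ror by 21 are the same operation on a 32-bit value. */
void check_rotate_pair(uint32_t x) {
    assert(model_rol(x, 11) == model_ror(x, 21));
}
```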

Being Effective
lea eax, MyPtr                  (mov eax, OFFSET MyPtr)
lea edi, [ebx+edi]
lea eax, [esp+10]
lea ecx, [eax*2+eax+6]
lea eax, MyPtr[eax+4][esi*2]
[base][+index[*scale]][+displacement], where scale is 1, 2, 4, or 8
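
lea just does the address arithmetic and keeps the number, which is why it doubles as a cheap multiply-and-add. A minimal C sketch, assuming the usual base + index*scale + displacement form; effective_address and lea_times3_plus6 are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* base + index*scale + displacement, wrapping at 32 bits.
   Only the register that carries the scale counts as the index. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint32_t)disp;
}

/* lea ecx, [eax*2+eax+6]: eax is both base and index, so ecx = 3*eax + 6.
   No memory is read and no flags change. */
uint32_t lea_times3_plus6(uint32_t eax) {
    return effective_address(eax, eax, 2, 6);
}
```

For instance, lea_times3_plus6(10) gives 36, the value ecx would hold after the lea in the list above with eax = 10.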

Sizing Things Up

movzx/movsx eax, bh
mov ax, WORD PTR [MyPtr+6]
inc BYTE PTR [eax]
cbw              (al->ax)
cwd, cwde        (ax->dx:ax, ax->eax)
cdq              (eax->edx:eax)
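
A quick C sketch of what the widening instructions do to the bits. All function names here are invented for the example; each one models a single instruction from the list above.

```c
#include <assert.h>
#include <stdint.h>

/* movzx eax, bh: upper 24 bits become zero. */
uint32_t model_movzx8(uint8_t src) {
    return (uint32_t)src;
}

/* movsx eax, bh: upper 24 bits become copies of bit 7. */
uint32_t model_movsx8(uint8_t src) {
    return (src & 0x80u) ? (0xFFFFFF00u | src) : (uint32_t)src;
}

/* cwde: ax -> eax, upper 16 bits become copies of bit 15. */
uint32_t model_cwde(uint16_t ax) {
    return (ax & 0x8000u) ? (0xFFFF0000u | ax) : (uint32_t)ax;
}

/* cdq: returns the value left in edx, which is all ones or all zeros
   depending on the sign bit of eax. eax itself is unchanged. */
uint32_t model_cdq_edx(uint32_t eax) {
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

void check_extension_examples(void) {
    assert(model_movzx8(0x80u) == 0x00000080u);
    assert(model_movsx8(0x80u) == 0xFFFFFF80u);
}
```

cdq is the usual setup step before idiv, since it fills edx with the sign of eax so that edx:eax holds the same signed value.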

Flags

sub, and -> cmp, test; same operations, just without changing dest

There are dozens of flags; you only need to know a few.
Carry        if there's a carry or borrow
Parity       if the low-order 8 bits of the result have even parity
Zero         if result is zero
Sign         if result is negative
Overflow     if the signed result is too large or too small to fit
Direction    string operations should go down
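
Here is one way to picture how a cmp fills in the arithmetic flags, as a C sketch. The flags struct and model_cmp are hypothetical names; the rules coded below are the standard ones for a 32-bit subtraction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the flags discussed above (1 = set). */
typedef struct {
    int cf, pf, zf, sf, of;
} flags;

/* Models the flags left by "cmp a, b": computes a - b and throws the
   result away, keeping only the flags. */
flags model_cmp(uint32_t a, uint32_t b) {
    uint32_t result = a - b;
    uint32_t low = result & 0xFFu;
    flags f;

    f.cf = a < b;                /* unsigned borrow */
    f.zf = result == 0;
    f.sf = (result >> 31) & 1;   /* top bit of the result */
    /* Signed overflow: a and b differ in sign, and the result's sign
       differs from a's. */
    f.of = (((a ^ b) & (a ^ result)) >> 31) & 1;
    /* Parity covers the low 8 bits only; set when the number of 1 bits
       is even. Fold the byte down to a single bit. */
    low ^= low >> 4;
    low ^= low >> 2;
    low ^= low >> 1;
    f.pf = !(low & 1);
    return f;
}

void check_cmp_examples(void) {
    assert(model_cmp(5u, 5u).zf == 1);           /* equal operands */
    assert(model_cmp(3u, 5u).cf == 1);           /* borrow: 3 < 5 unsigned */
    assert(model_cmp(0x80000000u, 1u).of == 1);  /* INT_MIN - 1 overflows */
}
```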

Getting Around
Unconditional:
JMP dest
Conditional (165) :
JCXZ, JECXZ, LOOP
JC/JB/JNAE, JNC/JNB/JAE, JBE/JNA, JA/JNBE
JE/JZ, JNE/JNZ, JS, JNS
JL/JNGE, JGE/JNL, JLE/JNG, JG/JNLE
JO, JNO, JP/JPE, JNP/JPO
Interrupts:
int 2Eh
into
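
The J* mnemonics come in unsigned (above/below) and signed (greater/less) families that read different flags. A C sketch of the common ones, with made-up names (flags, flags_after_cmp, takes_*) and the flag rules for a 32-bit cmp:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag snapshot (1 = set). */
typedef struct {
    int cf, zf, sf, of;
} flags;

/* Flags left by "cmp a, b" (a - b, result discarded). */
flags flags_after_cmp(uint32_t a, uint32_t b) {
    uint32_t r = a - b;
    flags f;
    f.cf = a < b;
    f.zf = r == 0;
    f.sf = (r >> 31) & 1;
    f.of = (((a ^ b) & (a ^ r)) >> 31) & 1;
    return f;
}

/* Unsigned comparisons look at CF and ZF. */
int takes_jb(flags f)  { return f.cf; }              /* JB/JC/JNAE */
int takes_ja(flags f)  { return !f.cf && !f.zf; }    /* JA/JNBE    */
int takes_jbe(flags f) { return f.cf || f.zf; }      /* JBE/JNA    */

/* Equality looks at ZF alone. */
int takes_je(flags f)  { return f.zf; }              /* JE/JZ      */

/* Signed comparisons look at SF, OF, and ZF. */
int takes_jl(flags f)  { return f.sf != f.of; }              /* JL/JNGE */
int takes_jge(flags f) { return f.sf == f.of; }              /* JGE/JNL */
int takes_jg(flags f)  { return !f.zf && f.sf == f.of; }     /* JG/JNLE */
int takes_jle(flags f) { return f.zf || f.sf != f.of; }      /* JLE/JNG */

/* 0FFFFFFFFh vs 1: "above" as unsigned, but "less" as signed (-1 < 1). */
void check_signed_vs_unsigned(void) {
    flags f = flags_after_cmp(0xFFFFFFFFu, 1u);
    assert(takes_ja(f));
    assert(takes_jl(f));
}
```

Picking the wrong family (say, JL on values that are really unsigned) is a bug that assembles cleanly and only misbehaves once the top bit is set.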

Addressing Modes

Segment overrides and related issues will be ignored


Register:             eax, ecx, ebp
Immediate:            5, 0x78
Direct memory:        MyVar, [MyVar+2]
Indirect memory:      [eax], [eax+esi+7]
Direct:               jmp label
Register Indirect:    jmp ebx
Memory Indirect:      jmp [ebx]
Relative:             jmp short $+2

Stacking Up
esp, ebp, ss are used to reference the stack
esp points to the top of the stack (last pushed value), while ebp points to whatever you want, but is usually used as the frame pointer
The stack grows downwards in memory
The call instruction automatically pushes the return address
ret alone pops the return address and jumps to it
ret with an immediate operand also pops that many bytes of arguments

The Stack Continues to Grow


push and pop perform the typical ADT operations
In 32-bit code, push and pop normally change esp by 4 bytes; a byte-sized immediate is sign-extended to 32 bits first. The exception is a 16-bit operand (e.g. push ax), which moves esp by only 2.
pushfd and popfd will push and pop the eflags register; this is very useful for directly manipulating flags
(you can use lahf and sahf to transfer directly between AH and the low byte of eflags, if that's all you want)
pushad and popad will save and restore the 8 GP registers
The stack can be used to effectively mov between segment registers
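
For anyone who wants to see the pointer arithmetic, here is a toy C model of 32-bit push and pop on a downward-growing stack. stack_model, STACK_BYTES, and the function names are all assumptions for the sketch; esp is modelled as an offset into a small byte array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u

/* Hypothetical toy stack: esp is an offset into mem. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;
} stack_model;

void stack_init(stack_model *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->esp = STACK_BYTES; /* empty: esp sits just past the highest byte */
}

/* push: move esp down by 4 first, then store at the new top. */
void model_push(stack_model *s, uint32_t value) {
    assert(s->esp >= 4);
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, 4);
}

/* pop: load from the current top, then move esp up by 4. */
uint32_t model_pop(stack_model *s) {
    uint32_t value;
    assert(s->esp + 4 <= STACK_BYTES);
    memcpy(&value, &s->mem[s->esp], 4);
    s->esp += 4;
    return value;
}
```

Pushing two values and popping them returns them in reverse order, with esp ending up where it started.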

Calling Conventions
Today, arguments are almost universally pushed last-argument-first; this accommodates varargs. (If you remember Windows 3.1, the PASCAL calling convention was first-argument-pushed-first.)
Return values are in eax for most data types
__stdcall and __thiscall (except with varargs) let the called function clean up the stack at the end of a call
__cdecl lets the caller clean up the stack after a function call returns
__fastcall is something that's used to mimic the speed of pure assembly programs, and therefore is generally irrelevant to real assembly programs. I don't have any details on it.
All calling conventions engage in some degree of name-mangling when going from source code to object code.

Prologue and Epilogue

Typical prologue:
    push ebp
    mov ebp,esp
    sub esp,LOCALSIZE
Typical epilogue:
    mov esp,ebp
    pop ebp
    ret
    <or> ret x, where x is an immediate specifying bytes to pop
In MS VC++, you can tell the compiler to omit prologue and epilogue code (almost always because you want to write it yourself in assembly) by specifying the attribute __declspec(naked)
Generally, temporary registers are saved and restored in these areas too
If you omit the frame pointer, a lot of this goes away
SEH adds a bunch of additional lines, but I'm still researching it.
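
To see why the frame has to be torn down in the right order, here is a toy C walk-through of one cdecl-style call with a single argument and 8 bytes of locals. Everything named here (cpu_model, simulate_call, the helper functions) is hypothetical, and esp/ebp are offsets into a small array. Note that esp must be restored from ebp (mov esp,ebp, or leave) before pop ebp, otherwise the pop would read a local instead of the saved ebp.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u

/* Hypothetical toy machine: a stack plus esp and ebp as offsets into it. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;
    uint32_t ebp;
} cpu_model;

static void store32(cpu_model *c, uint32_t addr, uint32_t v) {
    assert(addr + 4 <= STACK_BYTES);
    memcpy(&c->mem[addr], &v, 4);
}

static uint32_t load32(const cpu_model *c, uint32_t addr) {
    uint32_t v;
    assert(addr + 4 <= STACK_BYTES);
    memcpy(&v, &c->mem[addr], 4);
    return v;
}

static void push32(cpu_model *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;
    store32(c, c->esp, v);
}

static uint32_t pop32(cpu_model *c) {
    uint32_t v = load32(c, c->esp);
    c->esp += 4;
    return v;
}

/* Walks through one call and returns the address that ret jumps to.
   The argument as seen by the callee is written to *arg_seen. */
uint32_t simulate_call(cpu_model *c, uint32_t arg, uint32_t return_addr,
                       uint32_t *arg_seen) {
    uint32_t target;

    push32(c, arg);                        /* caller: push the argument   */
    push32(c, return_addr);                /* call:   push return address */

    push32(c, c->ebp);                     /* push ebp                    */
    c->ebp = c->esp;                       /* mov  ebp, esp               */
    c->esp -= 8;                           /* sub  esp, LOCALSIZE (8)     */

    *arg_seen = load32(c, c->ebp + 8);     /* first argument is [ebp+8]   */
    store32(c, c->ebp - 4, 0xDEADBEEFu);   /* a local lives at [ebp-4]    */

    c->esp = c->ebp;                       /* mov  esp, ebp               */
    c->ebp = pop32(c);                     /* pop  ebp                    */
    target = pop32(c);                     /* ret                         */
    c->esp += 4;                           /* caller cleanup: add esp, 4  */
    return target;
}
```

With a callee-cleans convention the last step would instead be folded into the return as ret 4.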

String Instructions

stos{b/w/d} stores identical data to a buffer
cmps{b/w/d} compares two buffers
scas{b/w/d} scans a buffer for a particular byte, word, or dword
movs{b/w/d} copies a buffer
ins{b/w/d} and outs{b/w/d} involve I/O ports and are only listed here because they're considered string instructions
lods{b/w/d} loads data from memory
All string instructions except lods* can be, and usually are, used with repeat prefixes.
The direction flag determines which way the pointers are moved.
edi is always the destination pointer and esi is always the source pointer
eax/ax/al are used with stos*, lods*, and scas* for single data items
flags can be set by cmps* and scas*, of course
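
A small C model of the byte-sized copy and fill, mainly to show what the direction flag changes. The function names are invented; esi, edi, and ecx are plain indexes and counts into one array rather than real registers, and segment details are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models "rep movsb": copy ecx bytes from mem[esi] to mem[edi].
   df = 0 walks upward, df = 1 walks downward. */
void model_rep_movsb(uint8_t *mem, size_t esi, size_t edi, size_t ecx, int df) {
    while (ecx != 0) {
        mem[edi] = mem[esi];
        if (df) { esi--; edi--; } else { esi++; edi++; }
        ecx--;
    }
}

/* Models "rep stosb": store al into ecx bytes starting at mem[edi]. */
void model_rep_stosb(uint8_t *mem, size_t edi, uint8_t al, size_t ecx, int df) {
    while (ecx != 0) {
        mem[edi] = al;
        if (df) { edi--; } else { edi++; }
        ecx--;
    }
}

/* Shifting a buffer up by one byte: the regions overlap, so the copy has
   to start from the high end with the direction flag set. */
void check_overlapping_copy(void) {
    uint8_t buf[5] = {1, 2, 3, 4, 0};
    model_rep_movsb(buf, 3, 4, 4, 1);
    assert(buf[1] == 1 && buf[2] == 2 && buf[3] == 3 && buf[4] == 4);
}
```

Running the same overlapping copy upward (df = 0, starting from indexes 0 and 1) smears the first byte across the whole buffer, which is the usual reason for setting the direction flag at all.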

Prefixes
lock is useful for multiprocessor systems, but will not be discussed here.
rep* is generally used with string instructions, to repeat an instruction a maximum of ecx times
rep is unconditional
repe/repz and repnz/repne are conditional, based, of course, on the zero flag
stos*, movs*, ins*, and outs* can use unconditional repeats
scas* and cmps* can use conditional repeats
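
The conditional repeats are easiest to follow with repne scasb, the classic "find this byte" loop. A C sketch with the direction flag assumed clear; scas_state and model_repne_scasb are hypothetical names, and zf starting at 0 is a simplification (on real hardware ZF is simply left untouched when ecx is 0 on entry).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register snapshot after the scan. */
typedef struct {
    size_t edi; /* index one past the last byte examined */
    size_t ecx; /* iterations left                       */
    int    zf;  /* 1 if the last comparison matched      */
} scas_state;

/* Models "repne scasb" with DF clear: compare al with mem[edi], advance
   edi, decrement ecx, and stop when ecx hits 0 or a match sets ZF. */
scas_state model_repne_scasb(const uint8_t *mem, size_t edi, size_t ecx,
                             uint8_t al) {
    scas_state s = { edi, ecx, 0 };
    while (s.ecx != 0) {
        s.zf = (mem[s.edi] == al);
        s.edi++;
        s.ecx--;
        if (s.zf) {
            break;
        }
    }
    return s;
}

/* Scanning for the terminating zero byte gives the string length:
   edi ends one past the terminator. */
void check_strlen_idiom(void) {
    const uint8_t text[] = "hello";
    scas_state s = model_repne_scasb(text, 0, sizeof text, 0);
    assert(s.zf == 1);
    assert(s.edi - 1 == 5);
}
```

After the loop, ZF tells you why it stopped: set means the byte was found, clear means ecx ran out first.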

Instruction Set 8086/88

AAA     AAD     AAM     AAS     ADC     ADD     AND     CALL
CBW     CLC     CLD     CLI     CMC     CMP     CMPSB   CMPSW
CWD     DAA     DAS     DEC     DIV     ESC     HLT     IDIV
IMUL    IN      INC     INT     INTO    IRET    JA      JAE
JB      JBE     JC      JCXZ    JE      JG      JGE     JL
JLE     JMP     JNA     JNAE    JNB     JNBE    JNC     JNE
JNG     JNGE    JNL     JNLE    JNO     JNP     JNS     JNZ
JO      JP      JPE     JPO     JS      JZ      LAHF    LDS
LEA     LES     LOCK    LODSB   LODSW   LOOP    LOOPE   LOOPNE
LOOPNZ  LOOPZ   MOV     MOVSB   MOVSW   MUL     NEG     NOP
NOT     OR      OUT     POP     POPF    PUSH    PUSHF   RCL
RCR     REP     REPE    REPNE   REPNZ   REPZ    RET     ROL
ROR     SAHF    SAL     SAR     SBB     SCASB   SCASW   SHL
SHR     STC     STD     STOSB   STOSW   SUB     TEST    WAIT
XCHG    XLAT    XOR

Instruction Set (p. 2)

80186/88:
BOUND   ENTER   INS     INSB    INSW    LEAVE   OUTS    OUTSB
OUTSW   POPA    PUSHA

80286:
ARPL    CLTS    LAR     LGDT    LIDT    LLDT    LMSW    LSL
LTR     SGDT    SIDT    SLDT    SMSW    STR     VERR    VERW

Instruction Set 80386

BSF     BSR     BT      BTC     BTR     BTS     CDQ     CMPSD
CWDE    INSD    JECXZ   LFS     LGS     LODSD   LSS     MOVSD
MOVSX   MOVZX   OUTSD   POPAD   POPFD   PUSHAD  PUSHFD  SCASD
SETA    SETAE   SETB    SETBE   SETC    SETE    SETG    SETGE
SETL    SETLE   SETNA   SETNAE  SETNB   SETNBE  SETNC   SETNE
SETNG   SETNGE  SETNL   SETNLE  SETNO   SETNP   SETNS   SETNZ
SETO    SETP    SETPE   SETPO   SETS    SETZ    SHLD    SHRD
STOSD

Instruction Set (p. 4)

80486:
BSWAP       CMPXCHG     INVD        INVLPG      WBINVD      XADD

Pentium I:
CMPXCHG8B   CPUID       RDMSR       RDTSC       RSM         WRMSR

Other Stuff:
CLFLUSH     CMOV*       CR0         CR2         CR3         CR4
DR0-7       LDMXCSR     LFENCE      MFENCE      PAUSE       PREFETCH*
SFENCE      STMXCSR     SYSENTER    SYSEXIT     UD2

The Road Ahead

Floating-point instructions
Vector instructions
Standalone assembly file directives?
Structured exception handling?
Disassembly techniques?
