Assembly Language Introduction

Assembly Language Programming
Inside the Central Processing Unit (CPU), all information is represented in binary form.
To use the computer we need to communicate with the CPU, but binary format is not
convenient for human users. Computer engineers therefore developed different kinds of
programming languages so that users can communicate with the CPU by writing programs.
There are high-level programming languages such as C, C++, Java, C#, and Pascal.
Assembly language is regarded as a low-level programming language because its syntax
is very close to the hardware itself. For example, to do a move/copy/assign operation
you use an instruction called MOV, whose name comes from the operation "move".
Advantages of using assembly programming language

1. The program can run faster because it can make full use of the CPU
2. The concepts are similar for different kinds of microprocessor, so after learning the
assembly language for one processor you can adapt to other processors fairly easily
3. It helps users understand the architecture of the microprocessor as well as how a
CPU performs its functions
Before you can write your program, you must know

1. the operations or instructions that are supported by the microprocessor (the
instruction set!). For example, if there is no multiply operation in the instruction
set, then you can only use repeated addition to carry out a multiplication.
2. the functions of the different registers, because you need to use the registers
during programming
3. how external memory is organized and how it is addressed to obtain instructions
and data
Once you know the above, you can start programming your microprocessor.
How to learn programming – the CLP model
• C – Concept: we must learn the basic syntax, such as how a program statement is
written
• L – Logic thinking: programming is a problem-solving process, so we must think
logically in order to derive a solution
• P – Practice: write more programs
Assembly language programming

The native language of the CPU is machine language, which uses 0 and 1 (binary) to
represent an operation or data. A single machine instruction can take up one or more
bytes of code. Assembly language lets you write the program using alphanumeric symbols
(mnemonics), eg ADD, MOV, PUSH.
After you have written your program, it is assembled (similar to compiled) and linked
into an executable program.
The executable program could be a .com, .exe, .bin, or .hex file.
Assembly program (xxx.asm) -> Assemble -> Object file (XXX.obj) -> Link -> Executable file (XXX.exe, .com, or .bin)
Example of a very simple instruction
Consider a very simple instruction, mov AL, 00H. It moves the value 00 (hex) into
the AL register of the 8086.
The machine code for mov AL, 00H is B0 00 (2 bytes). (B4 00 would be mov AH, 00H.)
After assembly, the value B000 is stored in memory.
When the program is executed, the value B000 is read from memory, decoded, and the
task is carried out.
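As a cross-check on this encoding, the following Python sketch models how the 8086 encodes MOV with an 8-bit register destination and an immediate byte: the opcode byte is B0H plus the register number, followed by the immediate value. The names REG8 and encode_mov_r8_imm8 are illustrative only and do not belong to any assembler.

```python
# 8-bit register numbers used in the 8086 "MOV r8, imm8" encoding
REG8 = {"AL": 0, "CL": 1, "DL": 2, "BL": 3, "AH": 4, "CH": 5, "DH": 6, "BH": 7}

def encode_mov_r8_imm8(reg, value):
    """Return the two machine-code bytes for MOV reg, value (illustrative helper)."""
    opcode = 0xB0 + REG8[reg.upper()]
    return bytes([opcode, value & 0xFF])

print(encode_mov_r8_imm8("AL", 0x00).hex().upper())  # B000
print(encode_mov_r8_imm8("AH", 0x00).hex().upper())  # B400
```

The second line shows that B4 00 is the encoding of mov AH, 00H, not mov AL, 00H.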
When you write an assembly language program, put only one instruction in each
statement. You cannot do things like:
A=(B+C)*100
To achieve the above, you need a sequence of operations:
Move B to A (A=B)
Add C to A (A=A+C)
Multiply A by 100 (A=A*100)
Usually, the term statement is used to describe a line in an assembly language
program.
A statement must specify which operation (opcode) is to be performed as well as the
operands.
Eg ADD AX, BX
ADD is the operation
AX is called the destination operand
BX is called the source operand
The result is AX = AX + BX
General format for an assembly language statement
Label    Instruction    Comment
Start:   Mov AX, BX     ; copy BX into AX
The label (Start:) is only used to identify a location within your program. You do
not need to include a label in every statement!
The word Start is the name of the label. A label's name is chosen by the user, provided
that it is not a reserved word and contains no spaces. After the name, you must include
the : to indicate that it is a label.
A comment is used to document the program so that the reader can understand the logic
of the program. A comment is identified by the ; put in front of it.
Memory structure
The memory of a computer is organized in bytes. Each byte occupies one address,
so even- or odd-addressed bytes of data can be accessed independently.
In a program, 8-bit data is declared with the directive db (b – byte) and 16-bit data
with dw (w – word).
To store 16-bit data, the most significant byte (MSB) is stored at the higher byte
address and the least significant byte (LSB) at the lower byte address.
For example, a 16-bit value ABCD (hex) occupies two address locations, say 12344H and
12345H. The low byte CD is stored in 12344H and the high byte AB is stored in 12345H.
Address   Content
12345H    AB   (higher address)
12344H    CD   (lower address)
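The byte ordering described here can be sketched in Python. This is only a model: memory is represented as a dictionary keyed by address, and store_word is a hypothetical helper name.

```python
def store_word(memory, address, value):
    """Store a 16-bit value the way the 8086 does: low byte at the lower address."""
    memory[address] = value & 0xFF             # low byte (LSB)
    memory[address + 1] = (value >> 8) & 0xFF  # high byte (MSB)

memory = {}
store_word(memory, 0x12344, 0xABCD)
for addr in sorted(memory, reverse=True):
    print(f"{addr:05X}H  {memory[addr]:02X}")
# 12345H  AB
# 12344H  CD
```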
Only four 64-Kbyte segments are active at the same time: code, stack, data, and extra.
The active segments are accessed via the segment registers: CS (code), SS (stack),
DS (data), ES (extra). When you write your program, you sometimes need to define the
different segments properly.
Segment register   Segment in memory
CS                 Code
SS                 Stack
DS                 Data
ES                 Extra
To get an instruction, we need to generate an address for the instruction (code).
The address of an instruction is formed from CS and IP.
The segment register CS points to the lowest-addressed location in the current code
segment.
CS is shifted left by 4 bits (multiplied by 10H) and added to IP to give a 20-bit
physical address: CS × 10H + IP.
IP is incremented so that it points to the next instruction (sometimes the term
Program Counter (PC) is used instead of IP).
This is the concept of offset and base. Do you still remember which is the offset and
which is the base?
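A minimal Python sketch of the address calculation, in which the segment register supplies the base and IP supplies the offset. The register values are made up for illustration, and physical_address is a hypothetical helper name.

```python
def physical_address(segment, offset):
    """20-bit physical address from a 16-bit segment (base) and 16-bit offset."""
    return ((segment << 4) + offset) & 0xFFFFF

# Hypothetical register values, chosen only for illustration
cs, ip = 0x1000, 0x0100
print(f"{physical_address(cs, ip):05X}H")  # 10100H
```

The final mask keeps the result within 20 bits, so an address that runs past FFFFFH wraps around to the start of memory, as it does on the 8086.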
Registers
In assembly language programming, you cannot operate on two memory locations in
the same instruction. So you usually need to load (move) the value of one location into
a register and then perform your operation. After the operation, you put the result back
into the memory location. Therefore, one operation that you will use very frequently is
the move operation!
There are different categories of registers in a microprocessor. For the 8086, the data
registers are the most frequently used in programming.
Data Registers in 8086
There are four data registers: AX, BX, CX, and DX. All four registers are 16-bit, but
each can also be used to store two 8-bit values. In that case the names AH, AL, BH, BL,
CH, CL, DH, and DL are used (H – High, L – Low), meaning that a 16-bit register, eg AX,
is divided into two 8-bit registers, AH and AL.
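The relationship between a 16-bit register and its two 8-bit halves can be modelled with shifts and masks. This is a Python sketch; split_register and join_register are illustrative names.

```python
def split_register(value16):
    """Return (high byte, low byte) of a 16-bit register value, eg AX -> (AH, AL)."""
    return (value16 >> 8) & 0xFF, value16 & 0xFF

def join_register(high, low):
    """Combine two 8-bit halves into one 16-bit register value."""
    return ((high & 0xFF) << 8) | (low & 0xFF)

ah, al = split_register(0x1234)
print(f"AH={ah:02X}H AL={al:02X}H")           # AH=12H AL=34H
print(f"AX={join_register(ah, 0xFF):04X}H")   # AX=12FFH
```

Writing a new value to AL alone changes only the low byte of AX, which is what the second line illustrates.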
The AX register is called the accumulator and is usually used for storing the result
of an operation.
Each of the four data registers can be used as the source or destination of an
arithmetic, logic, shift, or rotate operation. In some operations the use of the
accumulator (AX) is assumed, eg in multiplication; details are given later.
Special use of the data registers
In based addressing mode, the base register BX is used as a pointer to an operand in the
current data segment. Details are given in the section on addressing modes.
The CX register is used as a counter in some instructions. For example, CL contains the
number of bit positions by which an operand is rotated or shifted in a multiple-bit
rotate or shift. In a looping operation (the LOOP instruction), the content of CX
represents the number of iterations to be executed.
DX, the data register, is used in 16-bit multiplication and division operations, where
it holds the high-order word. It also holds the input/output port address for some types
of input/output operations.
Pointer and index registers
In addition to the data registers, there are the pointer and index registers, all 16-bit.
Some pointer and index registers can be used as general-purpose registers, ie as
operands in arithmetic or logic operations. However, most pointer and index registers
have special purposes.

The stack (or the stack segment) is used as temporary storage.
Data is stored by the PUSH instruction and retrieved by the POP instruction.
To access the stack, you use the SP (Stack Pointer) and BP (Base Pointer); details
are given in the section on the stack.
BP contains an offset address in the current stack segment. This offset address is
used in the based addressing mode and is commonly used by instructions in a subroutine
that reference parameters passed on the stack.
The source index register (SI) and destination index register (DI) hold offset
addresses for use in indexed addressing (similar to a pointer in C++ programming) of
operands in memory. In ordinary indexed addressing, both SI and DI refer to the current
data segment by default; in string instructions, SI refers to the current data segment
and DI refers to the current extra segment. Details can be found in the section on
addressing modes.
The index registers can also be used as source or destination registers in arithmetic
and logical operations, but only as 16-bit registers.
Data types
In 8086 assembly language the data types are simple: only 8-bit, 16-bit, and 32-bit
(called a double word). You cannot define data as integer, float, or char, as in C++.
An integer can be signed or unsigned, and byte-wide or word-wide.
For a signed integer, the most significant bit can be used to determine the sign (0 for
positive, 1 for negative).
For example, the value 1001 0100 is negative if it is a signed value.
The range of a signed byte (8-bit) is from 127 to –128.
The range of a signed word (16-bit) is from 32767 to –32768.
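The 8086 represents signed integers in two's complement, which is where these ranges come from. The Python sketch below interprets a bit pattern as a signed value; to_signed is an illustrative helper name.

```python
def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement signed integer."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value

print(to_signed(0b10010100, 8))                      # -108
print(to_signed(0x7F, 8), to_signed(0x80, 8))        # 127 -128
print(to_signed(0x7FFF, 16), to_signed(0x8000, 16))  # 32767 -32768
```

The same pattern 1001 0100 is 148 when treated as unsigned. The bits are identical and only the interpretation differs.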
The latest microprocessors can also support 64-bit or even 128-bit data.
Example of a simple Assembly Program
.code ; indicate start of code segment
.startup ; indicate beginning of a program
mov AX, 0
mov BX, 0000H ; sometimes we use #0000H
mov CX, 0
mov SI, AX
mov DI, AX
mov BP, AX
END ; end of file
The above shows a very simple 8086 assembly program. You can see the basic syntax
used in the program and how the code segment is defined using the .code keyword.
The flow of the program is top-down, ie from start to end, and only one statement is
executed at a time.
In general, an assembly program must include the code segment!
The code segment stores the program code.
Other segments, such as the stack segment and data segment, are not compulsory.
Keywords are used to indicate the beginning of a segment as well as the end of a
segment.
Example
DSEG segment 'data' ; define the start of a data segment
DSEG ENDS ; define the end of a data segment
SEGMENT is the keyword
DSEG is the name of the segment
Similarly, keywords are used to define the beginning of a program, as well as the end.
Example
CSEG segment 'code'
START PROC FAR ; define the start of a program (procedure)
RET ; return
START ENDP ; define the end of a program
CSEG ends
End start
Different assemblers may have different syntax for these keywords!
START PROC FAR and START ENDP mark the start and end of the main program; RET returns
from it.
Another example
Stacksg segment
…. ; define the stack segment
Stacksg ends
Datasg segment
…… ; declare data inside the data segment
Datasg ends
Codesg segment
Main proc far ;
assume ss:stacksg, ds: datasg, cs:codesg
mov ax, datasg
mov ds, ax
….
mov ax, 4c00H
int 21H
Main endp
Codesg ends
end main
Syntax for defining different components in a program:
To declare a segment, the syntax is:
segment_name SEGMENT
Example: Stacksg segment
PROC – defines a procedure inside the code segment. Each procedure (function) must
be identified by a unique name. At the end of the procedure, you must include the
keyword ENDP.
FAR – is related to program execution. When you request execution of a program, the
program loader uses this procedure as the entry point for the first instruction to execute.
ASSUME – associates the name of a segment with a segment register
assume ss:stacksg, ds: datasg, cs:codesg
In the above, the SS register is associated with the stacksg segment (which is the stack).
In some assemblers, you also need to load the base address of a segment into the
segment register yourself, via a general register (as mov ax, datasg and mov ds, ax do
in the example above). Examples are given later.
END – ends the entire program and appears as the last statement. Usually the name of the
first or only PROC designated as FAR is put after END.
Fortunately, if you are doing something simple, you do not need to include all the
segment declarations in the program. For example:
start:
mov DL, 0H ; move 0H to DL
mov CL, op1 ; move op1 to CL
mov AL, data ; move data to AL
step:
cmp AL, op1 ; compare AL and op1
jc label1 ; if carry = 1 (AL < op1) jump to label1
sub AL, op1 ; AL = AL - op1
inc DL ; DL = DL + 1
jmp step ; jump to step
label1:
mov AH, DL ; move DL to AH
HLT ; halt, end of program
data db 45 ; define a byte variable holding 45
op1 db 6 ; define a byte variable holding 6
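The loop in this program divides data by op1 using repeated subtraction: DL counts how many times op1 can be subtracted, and AL is left holding the remainder. The Python sketch below mirrors that logic step by step. The function name is illustrative, and the guard against a zero divisor is an addition that the assembly program does not have.

```python
def divide_by_subtraction(dividend, divisor):
    """Mirror the assembly loop: return (quotient, remainder)."""
    if divisor == 0:
        raise ValueError("divisor must be non-zero (the loop would never end)")
    quotient = 0                 # DL
    remainder = dividend         # AL
    while remainder >= divisor:  # cmp AL, op1 / jc label1 leaves when AL < op1
        remainder -= divisor     # sub AL, op1
        quotient += 1            # inc DL
    return quotient, remainder

print(divide_by_subtraction(45, 6))  # (7, 3)
```

With data = 45 and op1 = 6, the program therefore ends with 7 in DL (copied to AH) and 3 in AL.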
Assemblers for 8086 – to assemble your programs
WASM – freeware that can be downloaded from the internet
(http://user.mc.net/~warp/software_wasm.html)
Emu8086 (http://www.emu8086.com)
Emu8086 includes a tutorial and a reference for the complete instruction set.
During the lectures, this software is used to demonstrate the program examples.
Keil (www.keil.com)
Assembly language Program
An assembly language program can be more efficient: it can take up less memory space
and run faster. In real-time applications, an assembly program may be required because
a program written in a high-level language might not respond quickly enough. The syntax
for different microprocessors may differ, but the concepts are the same, so once you
learn assembly programming for one microprocessor you can more easily program other
kinds of system. For example, the concepts used in programming the 8051 series are
similar to those for the 8086.
Defining data in a program
Data is stored in the data segment.

You can define constants and work areas (chunks of memory) in your program.
Data can be defined in different lengths (8-bit, 16-bit):
for 8-bit use DB, for 16-bit use DW
The definition for data:
[name] Dn expression
Name – a program references a data item by means of its name. The name of an item is
otherwise optional.
Dn – this is called the directive. It defines the length of the data.
Expression – defines the value (content) of the data
Examples for data
FLDA DB ? ; define an uninitialized 8-bit item called FLDA
FLDB DB 25 ; define an item initialized to 25
Define multiple data under the same name (like an array)
FLDC DB 21, 22, 23, 34 ; the data are stored in adjacent bytes
FLDC refers to the first value
FLDC + 1 refers to the second value
You can do mov AL, FLDC+3 ; loads the fourth value, 34, into AL
DUP – duplicate
DUP can be used to define multiple storage locations
DB 10 DUP (?) ; defines 10 uninitialized bytes
DB 5 DUP (12) ; defines 5 bytes, all initialized to 12
What about DB 3 DUP (5 DUP (4))?
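One way to reason about a nested DUP is to expand it from the inside out. The Python sketch below models DUP as list repetition; dup is an illustrative helper, not assembler syntax.

```python
def dup(count, values):
    """Model 'count DUP (values)': repeat the list of byte values count times."""
    return list(values) * count

inner = dup(5, [4])      # 5 DUP (4)         -> five bytes, each 4
outer = dup(3, inner)    # 3 DUP (5 DUP (4)) -> the five bytes repeated 3 times
print(len(outer), set(outer))  # 15 {4}
```

So the nested form reserves 3 × 5 = 15 bytes, all initialized to 4.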

String:
DB 'this is a test'
EQU – this directive does not define a data item; instead, it defines
a value that the assembler substitutes in other instructions
(similar to defining a constant in C programming with #define)
factor EQU 12
mov CX, factor
Addressing modes
The addressing modes define how the operands of an instruction are accessed.
There are 9 modes: register addressing, immediate addressing, direct addressing, register
indirect addressing, based addressing, indexed addressing, based indexed addressing,
string addressing, and port addressing.
Fetching an operand requires the BIU (bus interface unit) to perform a read/write bus
cycle to the memory subsystem if necessary. Different addressing modes provide different
ways of computing the address of an operand.
For example, in C++ programming you can do:
int *x, y;
x = &y; // this is indirect addressing
OR
int x[10], y;
*(x+1) = y;
The above C++ examples illustrate different syntax for accessing data through different
addressing modes.
When using different addressing modes, you must know clearly how the offset address is
calculated. With different kinds of addressing mode, the offset address may be built
from different components, while the base address could come from the data segment,
the stack segment, or the extra segment.
Register addressing mode
The operand to be accessed resides in an internal register of the 8086.
Eg MOV AX, BX
Move (MOV) the contents of BX (the source operand) to AX (the destination operand).
In the above operation both operands are in internal registers, and this is called
register addressing mode.
Immediate addressing mode
This involves the use of an immediate value.
The source operand (the immediate value) is part of the instruction.
Immediate operands usually represent constant data; the operand can be either a byte or
a word.
Eg MOV AL, 15
15 is a byte-wide immediate source operand
In some assemblers it is written MOV AL, #15
The immediate operand is stored in program memory (ie the code segment).
The value is fetched into the instruction queue in the BIU along with the instruction,
so no additional external memory bus cycle is needed to read it!
Direct addressing mode
This is also a simple addressing mode. It moves a byte or word between a memory
location and a register. The memory location most likely represents a variable.
The locations following the instruction opcode hold an effective address (EA)
instead of the data.
The EA is an offset address giving the storage location of the operand relative to the
start of the segment selected by the data segment register.
Physical address = DS × 10H + offset
The instruction set does not support a memory-to-memory transfer!
Example:
Mov AL, var1 ; move the content of var1 to the AL register
Var1 db 12 ; define a memory location var1 to store the value 12
After the move, AL = 12
In the above, var1 can be regarded as a variable. To get the data represented by var1,
a memory read cycle is needed. The data is assumed to be stored in the data segment,
and the content of DS is used as the segment address.
Register indirect addressing mode
This addressing mode transfers a byte or a word between a register and a memory
location addressed by an index or pointer register.
The effective address (EA) is held in either a pointer register or an index register.
The pointer register can be either the base register BX or the base pointer register BP.
The index register can be the source index register SI or the destination index register DI.
The default segment is DS for BX, SI, and DI, and SS for BP.
Refer to the memory map in Table 1.
Eg MOV AL, [SI]

If SI = 01235H, then after the move AL = 18 according to Table 1.
The value stored in the SI register is used as the offset address. From the offset
address you can then obtain the physical address of the data.
The segment register is DS in this example.
In register indirect addressing mode, the EA is a variable and depends on the value of
the index or base register.
Eg mov [BX], CL
If CL = 88 and BX = 01233H, then after the move the content of address 01233H becomes 88.
Which segment register will be used for the above operation?
Table 1 Memory map
Address (in HEX)   Content
01236              19
01235              18
01234              20
01233
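The two register indirect examples can be traced with a small Python model of the memory map. DS = 0000H is an assumption made here so that the offsets equal the addresses listed in Table 1, and the contents are treated as plain numbers. The helper name physical_address is illustrative.

```python
def physical_address(segment, offset):
    """20-bit physical address: segment × 10H + offset."""
    return ((segment << 4) + offset) & 0xFFFFF

# Table 1, keyed by physical address
memory = {0x01236: 19, 0x01235: 18, 0x01234: 20}

ds = 0x0000  # assumed, so that offsets equal the addresses in Table 1

si = 0x1235
al = memory[physical_address(ds, si)]   # MOV AL, [SI]
print(al)                               # 18

bx, cl = 0x1233, 88
memory[physical_address(ds, bx)] = cl   # MOV [BX], CL
print(memory[0x01233])                  # 88
```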
Base-plus-index addressing mode
This moves a byte or a word between a register and the memory location addressed by a
base register (BP or BX) plus an index register (DI or SI). You need a base value and an
index in order to reach the proper data.
The effective address is the sum of the base register and the index register. The
physical address of the operand is obtained by adding this effective address to
DS × 10H when the base register is BX, or to SS × 10H when the base pointer BP is used.
Eg MOV [BX+SI], AL

Move the value in AL to the location DS × 10H + BX + SI
If BP is used, then the SS register is used instead of DS
The base register (BX) often holds the beginning location of a memory array, while the
index register (SI) holds the relative position of an element in the array.
Register relative addressing mode
This moves a byte or a word between a register and a memory location addressed by an
index or base register plus a displacement.
Eg MOV AL, ARRAY[SI]

EA = value of SI + offset address of ARRAY
Physical address = DS × 10H + EA
Eg mov AX, [BX+4]
Eg mov AX, array[DI+3]
This is similar to base-plus-index addressing, with a constant displacement in place of
one of the registers.
Base relative plus index addressing mode
This transfers a byte or a word between a register and the memory location addressed
by a base register and an index register plus a displacement, so the address has 3
components.
This addressing mode is a combination of the based addressing mode and the indexed
addressing mode.
Eg MOV AH, [BX+DI+4]

EA = value of BX + 4 + value of DI
Physical address = DS × 10H + EA
This can be used to access data stored as a 2-D matrix
Eg mov AX, array[BX+DI]
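To see how the three components can address a 2-D matrix, the Python sketch below computes the effective address for an element of a byte matrix. The matrix size, the offset of the array, and the chosen row and column are all hypothetical, and effective_address is an illustrative helper name.

```python
def effective_address(base=0, index=0, displacement=0):
    """EA = base register + index register + displacement, wrapped to 16 bits."""
    return (base + index + displacement) & 0xFFFF

COLS = 4          # hypothetical matrix: 3 rows x 4 columns of bytes
ARRAY = 0x0100    # hypothetical offset of the matrix in the data segment

row, col = 2, 1
bx = row * COLS   # base register selects the row
di = col          # index register selects the column
ea = effective_address(base=bx, index=di, displacement=ARRAY)
print(f"{ea:04X}H")  # 0109H
```

Here the displacement plays the role of the array name, BX steps from row to row, and DI steps along a row.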
Summary of different addressing modes

Instruction               Addressing mode
MOV AL, BL                Register
MOV AL, 15H               Immediate
MOV AL, abc               Direct
MOV AL, [1234H]
MOV AL, [SI]              Register indirect
MOV AL, [BX+SI]           Base plus index
MOV AL, [BX+4]            Register relative
MOV AL, ARRAY[3]
MOV AL, [BX+DI+4]         Base relative plus index
MOV AL, ARRAY[BX+DI]
In register indirect addressing mode, such as MOV AL, [SI], the value of SI represents
an address. How do you move the address of a variable into a register?
This is achieved by the instruction LEA (load effective address).
LEA is similar to the following C++ syntax
int* x;
x = &y; // assign the address of y to pointer x
Syntax of LEA
LEA SI, ARRAY ; move the address of variable ARRAY to the SI register
MOV AL, [SI] ; the value 12 is moved to AL
ARRAY db 12 ; define a memory location ARRAY with 12 stored in it
Example
1. Select an instruction for each of the following tasks:
– Copy content of BL to CL
– Copy content of DS to AX
2. Suppose that DS=1100H, BX=0200H, LIST=0250H, and SI=0500H. Determine the
physical address being accessed by each of the following instructions:
mov LIST[SI], DX
mov CL, LIST[BX+SI]
mov CH, [BX+SI]
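Question 2 can be worked through with a short Python sketch: the effective address is the sum of the components inside the brackets (plus LIST where it appears), and the physical address is DS × 10H + EA. The helper name physical_address is illustrative.

```python
def physical_address(segment, offset):
    """20-bit physical address: segment × 10H + 16-bit offset."""
    return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF

DS, BX, SI, LIST = 0x1100, 0x0200, 0x0500, 0x0250

cases = {
    "mov LIST[SI], DX":    LIST + SI,
    "mov CL, LIST[BX+SI]": LIST + BX + SI,
    "mov CH, [BX+SI]":     BX + SI,
}
for instruction, ea in cases.items():
    print(f"{instruction:22} EA={ea:04X}H  PA={physical_address(DS, ea):05X}H")
# mov LIST[SI], DX       EA=0750H  PA=11750H
# mov CL, LIST[BX+SI]    EA=0950H  PA=11950H
# mov CH, [BX+SI]        EA=0700H  PA=11700H
```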
String addressing mode
The string instructions of the 8086 instruction set automatically use the source index
register (SI) and destination index register (DI) to specify the effective addresses of
the source and destination operands, respectively.
An example is the MOVS instruction (written MOVSB for bytes or MOVSW for words).
No operand is needed after MOVSB or MOVSW.
You do not specify the registers, but SI and DI are used during execution, so you must
set the values of SI and DI before you use MOVS.
Port addressing mode
Port addressing is used in conjunction with the IN and OUT instructions to access input
and output ports. For memory-mapped ports, any of the memory addressing modes can be
used for the port address; details are in the I/O section.
For ports in the I/O address space, only the direct addressing mode and an indirect
addressing mode using DX are available.
Eg IN AL, 15H ; second operand is the port number

Input data from the input port at address 15H of the I/O address space to register AL
Eg IN AL, DX
Load AL with data from the port whose number is stored in DX
Exercises
1. Compute the physical address for the specified operand in each of the following
instructions:
MOV [DI], AX (destination operand)
MOV DI, [SI] (source operand)
MOV XYZ[DI], AH (destination operand)
Given (all values in hex) CS=0A00, DS=0B00, SI=0100, DI=0200,
BX=0300, XYZ=0400
2. Express the following decimal numbers as unpacked and packed BCD bytes (BCD –
binary coded decimal)
a. 29 b. 88
3. How would the BCD numbers be stored in memory starting at address 0B000?
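As a reminder of the two BCD formats, the Python sketch below converts a decimal number that is not one of the exercise values. Unpacked BCD stores one decimal digit per byte, and packed BCD stores two digits per byte. The function names are illustrative, and the input is assumed to be a non-negative integer.

```python
def unpacked_bcd(n):
    """One decimal digit per byte, most significant digit first."""
    return bytes(int(d) for d in str(n))

def packed_bcd(n):
    """Two decimal digits per byte (high nibble, then low nibble)."""
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits  # pad to an even number of digits
    return bytes(int(digits[i]) << 4 | int(digits[i + 1])
                 for i in range(0, len(digits), 2))

print(unpacked_bcd(47).hex(" "))  # 04 07
print(packed_bcd(47).hex())       # 47
```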
Example
Determine what is being moved in each "MOV" statement in the following assembly
program.
; first define a data segment

dat segment para 'data'
ival db 10H
array db 8, 2, 3, 4
dat ends
; COM file is loaded at CS:0100h
; define the code segment - instructions

CSEG segment 'code'
beng proc FAR
mov AX, dat ; assign data segment address to AX
mov DS, AX ; now DS is pointing to the data segment or DS stores the base
; address of the data segment
mov AX, BX ; register addressing mode

mov BL, ival ; direct addressing
mov AL, 7fH ; immediate addressing mode
mov BL, 04fH
add AL, BL ; AL = AL+BL
mov [10], AL ; direct addressing mode

mov CL, [11]
mov cl, array+2
LEA BX, array ; load effective address of array to BX
mov al, 2
mov dl, [bx] ; register indirect addressing
mov SI, 2
mov dh, [bx+SI] ; base+index addressing
mov ah, array[3] ; base relative addressing
mov al, [bx+si+1] ; base relative + index
mov cl, array+2
beng endp
CSEG ends
end beng
Determine values being moved in each operation.
