
Department of Electrical and Computer Engineering

College of Engineering and Technology


Jimma University
  Introduction to assembly language
  Syntax of an assembly language program
  General structure of an assembly language program
  Components of assembly language
  Directives
  Identifiers
  Some examples of assembly language programs

  Three types of programming language
  Machine language
  Assembly language
  High-level language
  Machine language:
  Set of fundamental instructions the machine can execute
  Expressed as a pattern of 1’s and 0’s
  Can be run directly by the processor
  Assembly language:
  Alphanumeric equivalent of machine language
  Mnemonics are more human-oriented than 1’s and 0’s
  Uses an assembler, which converts assembly language to machine language
  High-level language:
  Closer to human language
  It is human friendly
  It is application specific
  Uses a compiler or interpreter
  Assembler:
  Computer program that transliterates (one-to-one mapping) assembly language to machine language

  Educational value
  Learning assembly language programming helps in understanding how the microprocessor operates
  Faster and shorter programs
  Compilers do not always generate optimum code
  Small controllers embedded in many products
  Have specialized functions
  Rely heavily on input/output functionality
  HLLs are often inappropriate for this type of product development
  Other users include game developers and virus/antivirus writers

  An assembly language program must be translated to machine language for the target processor.
  The following diagram describes the steps from creating a source program through executing the assembled program.
  If the source code is modified, Steps 2 through 4 must be repeated.

Step 1: text editor -> Source File
Step 2: assembler   : Source File -> Object File (also produces a Listing File)
Step 3: linker      : Object File + Link Library -> Executable File (also produces a Map File)
Step 4: OS loader   : Executable File -> Output

  MASM
  Microsoft: Macro Assembler
  TASM
  Borland: Turbo Assembler
  NASM
  Netwide Assembler: free and open source (older releases were distributed under the LGPL)
  Others: Flat Assembler (FASM), SpAssembler, etc.
#include <iostream>
using namespace std;

int main()
{
    int x, y;                   // declared but not used in this example
    cout << "hello world\n";
    return 0;
}

.model small
.stack 100h
.data
message db 'Hello World', 13, 10, '$'
.code
start:
    mov ax, @data
    mov ds, ax
    mov dx, offset message  ; copy address of message to dx
    mov ah, 9h              ; string output
    int 21h                 ; display string
    mov ax, 4c00h
    int 21h
end start
TITLE PRGM1
.MODEL SMALL
.STACK 100H
.DATA
A DW 2
B DW 5
SUM DW ?
.CODE
MAIN PROC
; initialize DS
MOV AX, @DATA
MOV DS, AX
; add the numbers
MOV AX, A
ADD AX, B
MOV SUM, AX
; exit to DOS
MOV AX, 4C00H
INT 21H
MAIN ENDP
END MAIN
  An instruction is a statement that becomes executable when a program is assembled.
  The assembler assembles an assembly language program one line at a time.
  The syntax of each line is:

 [label:] mnemonic [operands] [ ; comment]


 Each line of the assembly language consists of
 Label (optional)
 Mnemonic (required)
 Operand (depends on the instruction)
 Comment (optional)
  Labels act as place markers
  A label marks the address (offset) of code or data
  Labels follow the identifier rules
  A label is optional
  There are two types of labels
  Data label (identifier, data segment label)
  example name: myArray (not followed by a colon)
  example definition: count DWORD 100
  Code label (code segment label, function name)
  target of jump and loop instructions
  followed by a colon, for example:
L1: mov ax, bx
    jmp L1

  Instruction mnemonics (opcode, instruction set)
  memory aid
  examples: MOV, ADD, SUB, MUL, INC, DEC
  Operands
  constant: 96
  constant expression: 2+4
  register: ax
  memory (data label): count
  Constants and constant expressions are often called immediate values
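The four operand kinds listed above can be seen side by side in a short program. This is only a minimal sketch, assuming MASM-style syntax and the 16-bit DOS small model used elsewhere in these slides; the value chosen for count is arbitrary.

```asm
; Sketch: one instruction for each operand kind (assumed MASM, 16-bit DOS)
.MODEL SMALL
.STACK 100H
.DATA
count   DW 100                  ; data label: names a word in memory
.CODE
MAIN PROC
        MOV  AX, @DATA          ; initialize DS so that count is addressable
        MOV  DS, AX
        MOV  AX, 96             ; constant (immediate) operand
        MOV  BX, 2+4            ; constant expression, evaluated by the assembler to 6
        ADD  AX, BX             ; register operands: AX = 96 + 6 = 102
        ADD  AX, count          ; memory operand: AX = 102 + 100 = 202
        MOV  AX, 4C00H          ; exit to DOS
        INT  21H
MAIN ENDP
END MAIN
```

The expression 2+4 costs nothing at run time: the assembler folds it into the immediate value 6 before the instruction is encoded.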
Comments are good!
 explain the program's purpose
 when it was written, and by whom
 revision information
 tricky coding techniques
 application-specific explanations
  In assembly language, single-line comments begin with a semicolon (;)
  Directives are commands that are recognized and acted upon by the assembler
  Not part of the Intel instruction set
  Used to declare code and data areas, select the memory model, declare procedures, etc.
  Not case sensitive
  Different assemblers have different directives
  NASM's directives are not the same as MASM's, for example
  EQU directive (named constants)
  The EQU pseudo-op assigns a name to a constant.
  Makes assembly language easier to understand.
  No memory is allocated for EQU names.
  Example: LF EQU 0AH
  MOV DL, 0AH and MOV DL, LF then assemble to the same instruction
  Example: PROMPT EQU "Type your name"
  MSG DB "Type your name" and MSG DB PROMPT then define the same string
  DUP operator
  Used to define arrays whose elements share a common initial value.
  It has the form: repeat_count DUP (value)
  E.g. Numbers DB 100 DUP(0)
  Allocates an array of 100 bytes, each initialized to 0.
  Names DW 200 DUP(?)
  Allocates an array of 200 uninitialized words.
  DUP may be nested; the following two definitions are equivalent
  Line DB 5, 4, 3 DUP(2, 3 DUP(0), 1)
  Line DB 5, 4, 2, 0, 0, 0, 1, 2, 0, 0, 0, 1, 2, 0, 0, 0, 1
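EQU and DUP are often used together, so that the array size is written in one place only. The following is a minimal sketch under the same assumptions as the other programs here (MASM syntax, small model); the names ARRAY_LEN and Values and the fill value 7 are made up for illustration.

```asm
; Sketch: a named constant sizes an array and drives the loop that sums it
.MODEL SMALL
.STACK 100H
ARRAY_LEN EQU 10                ; named constant: no memory is allocated
.DATA
Values  DB ARRAY_LEN DUP(7)     ; ARRAY_LEN bytes, each initialized to 7
.CODE
MAIN PROC
        MOV  AX, @DATA
        MOV  DS, AX
        XOR  AL, AL             ; AL = running sum
        XOR  BX, BX             ; BX = index into Values
        MOV  CX, ARRAY_LEN      ; loop counter
Next:   ADD  AL, Values[BX]     ; add one element
        INC  BX
        LOOP Next               ; after the loop AL = 10 * 7 = 70 (46H)
        MOV  AX, 4C00H          ; exit to DOS
        INT  21H
MAIN ENDP
END MAIN
```

Changing ARRAY_LEN resizes the array and adjusts the loop count together, which is the main reason to prefer a named constant over a repeated literal.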
  PTR operator
  Used to override the declared type of an address expression.
  Examples:
  MOV [BX], 1            ; illegal: the operand size is ambiguous
  MOV BYTE PTR [BX], 1   ; legal: stores one byte
  MOV WORD PTR [BX], 1   ; legal: stores one word
  Let J be defined as follows
  J DW 10
  MOV AL, J              ; illegal: J is a word, AL is a byte
  MOV AL, BYTE PTR J     ; legal: loads the low byte of J
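To make the effect of PTR concrete, the sketch below reads and writes the individual bytes of a word variable. It assumes MASM syntax and the small model; the initial value 1234H is chosen only so that the two bytes are easy to tell apart.

```asm
; Sketch: BYTE PTR and WORD PTR applied to the same memory location
.MODEL SMALL
.STACK 100H
.DATA
J       DW 1234H                    ; stored in memory as 34H, 12H (low byte first)
.CODE
MAIN PROC
        MOV  AX, @DATA
        MOV  DS, AX
        MOV  AL, BYTE PTR J         ; AL = 34H, the low byte of J
        MOV  AH, BYTE PTR [J+1]     ; AH = 12H, the high byte of J
        MOV  BX, OFFSET J
        MOV  BYTE PTR [BX], 0       ; clears only the low byte: J = 1200H
        MOV  WORD PTR [BX], 0       ; clears both bytes: J = 0000H
        MOV  AX, 4C00H              ; exit to DOS
        INT  21H
MAIN ENDP
END MAIN
```

Without the PTR operator the two stores through [BX] would be rejected, because neither operand tells the assembler how many bytes to write.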
  Identifiers
  Programmer-chosen name to identify a variable, constant, procedure, or code label
  1-247 characters, including digits
  Not case sensitive
  First character must be a letter, _, @, ?, or $
  Subsequent characters may also be digits
  Cannot be the same as a reserved word
  @ is used by the assembler as a prefix for predefined symbols, so avoid it in identifiers
  Examples
  Var1, Count, $first, _main, MAX, open_file, myFile, xVal, _12345
  Reserved words cannot be used as identifiers
  Instruction mnemonics
  MOV, ADD, MUL, …
  Register names
  Directives – tell MASM how to assemble programs
  Type attributes – provide size and usage information
  BYTE, WORD
  Operators – used in constant expressions
  Predefined symbols – @data
 A data definition statement sets aside storage in memory for a
variable.
 May optionally assign a name (label) to the data
 Syntax:
[name] directive initializer [,initializer] . . .

value1 BYTE 10

 All initializers become binary data in memory


  Each variable has a data type and is assigned a memory address.
 Data-defining pseudo-ops
 DB define byte
 DW define word
 DD define double word (two consecutive
words)
 DQ define quad word (four consecutive words)
 DT define ten bytes (five consecutive words)
 Each pseudo-op can be used to define one or
more data items of given type.
  Assembler directive format for defining a byte variable
  name DB initial_value
  I DB 4          ; defines variable I with initial value 4
  J DB ?          ; defines variable J, left uninitialized
  A question mark ("?") in place of the initial value leaves the variable uninitialized
  K DB 5, 3, -1   ; allocates 3 bytes: 05H, 03H, FFH
Examples that use multiple initializers:

list1 BYTE 10,20,30,40
list2 BYTE 10,20,30,40
      BYTE 50,60,70,80
      BYTE 81,82,83,84
list3 BYTE ?,32,41h,00100010b
list4 BYTE 0Ah,20h,'A',22h

Offset  Value
0000    10    <- list1
0001    20
0002    30
0003    40
0004    10    <- list2
0005    20
0006    30
0007    40
0008    50
0009    60
000A    70
000B    80
000C    81
000D    82
000E    83
000F    84
0010          <- list3
  Assembler directive format for defining a word variable
  name DW initial_value
  Each word is stored low byte first:

Definition      Bytes in memory (increasing address)
I DW 4          04 00
J DW -2         FE FF
K DW 1ABCH      BC 1A
L DW "01"       31 30
A string is implemented as an array of characters
 For convenience, it is usually enclosed in quotation marks
 It often will be null-terminated (ending with ,0)
 Examples:

str1 BYTE "Enter your name",0


str2 BYTE 'Error: halting program',0
str3 BYTE 'A','E','I','O','U'
greeting BYTE "Welcome to the Encryption Demo program "
 End-of-line character sequence:
 0Dh = carriage return
 0Ah = line feed
str1 BYTE "Enter your name: ",0Dh,0Ah
BYTE "Enter your address: ",0

newLine BYTE 0Dh,0Ah,0

Idea: Define all strings used by your program in the same area of
the data segment.

 CPU communicates with peripherals through I/O
registers called I/O ports.
 Two instructions access I/O ports directly: IN and
OUT.
 Used when fast I/O is essential, e.g. games.
 Most programs do not use IN/OUT instructions
 port addresses vary among computer models
 much easier to program I/O with service routines
provided by manufacturer
 Two categories of I/O service routines
 Basic input/output system (BIOS) routines
 Disk operating system (DOS) routines
 DOS and BIOS routines invoked by INT (interrupt)
instruction.
Layers between the hardware and an application (bottom to top):
System Hardware
   | non-standard interface
BIOS
   | standard interface
Operating System
   | standard interface
Application Program
  INT 21H is used to invoke a large number of DOS functions.
  The function to call is selected by putting its number in the AH register.
 AH=1 single-key input with echo
 AH=2 single-character output
 AH=9 character string output
 AH=8 single-key input without echo
 AH=0Ah character string input
  Single-character output. Input: AH=2, DL=ASCII code of the character to be output
  Output: AL=ASCII code of the character that was displayed
  To display a character
  MOV AH, 2
  MOV DL, '?' ; display the character '?'
  INT 21H
 To read a character and display it
 MOV AH, 1
 INT 21H
 MOV AH, 2
 MOV DL, AL
 INT 21H
  Single-key input. Input: AH=1
  Output: AL=ASCII code if a character key is pressed, otherwise 0.
 To input character with echo:
 MOV AH, 1
 INT 21H ; read character will be in AL register
 To input a character without echo:
 MOV AH, 8
 INT 21H ; read character will be in AL register
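Functions 8 and 2 can be combined to hide what the user types, as a password prompt would. The program below is a minimal sketch of that idea, assuming MASM syntax and the small model; the choice of '*' as the mask character and of Enter as the terminator is an illustration, not something DOS prescribes.

```asm
; Sketch: read keys without echo and display '*' for each one until Enter
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
ReadKey:
        MOV  AH, 8              ; read one key without echo; result in AL
        INT  21H
        CMP  AL, 0DH            ; Enter (carriage return)?
        JE   Done
        MOV  DL, '*'
        MOV  AH, 2              ; display '*' in place of the typed key
        INT  21H
        JMP  ReadKey
Done:
        MOV  AX, 4C00H          ; exit to DOS
        INT  21H
MAIN ENDP
END MAIN
```

Replacing function 8 with function 1 in this loop would echo the typed character as well, which shows the practical difference between the two input functions.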
  String output. Input: AH=9, DX=offset address of the string.
  The string must end with a '$' character.
  To display the message Hello!
  MSG DB "Hello!$"
 MOV AH, 9
 MOV DX, offset MSG
 INT 21H
 OFFSET operator returns the address of a
variable
 The instruction LEA (load effective address)
loads destination with address of source
 LEA DX, MSG
.model small
.stack 100h
.data
message db 'Hello World', 13, 10, '$'
.code
start:
    mov ax, @data
    mov ds, ax
    mov dx, offset message  ; copy address of message to dx
    mov ah, 9h              ; string output
    int 21h                 ; display string
    mov ax, 4c00h
    int 21h
end start

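  Buffered string input. Input: AH=0AH (10), DX=offset address of a buffer in which to store the string.
  The first byte of the buffer must contain the maximum string size + 1 (room for the final carriage return)
  The second byte of the buffer is filled in by DOS with the number of characters actually read.
  To read a name of at most 20 characters and display it
  UserName DB 21, 0, 22 DUP('$')   ; NAME itself is a reserved word in MASM
  MOV AH, 10
  LEA DX, UserName
  INT 21H
  MOV AH, 9
  LEA DX, UserName+2               ; the characters start at the third byte
  INT 21H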
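Pre-filling the buffer with '$' works, but the count that DOS stores in the second byte of the buffer can also be used to place the terminator exactly. The following is a minimal sketch of that alternative, assuming MASM syntax and the small model; the names Buf and NewLine are hypothetical.

```asm
; Sketch: buffered input, then terminate the text using the stored length
.MODEL SMALL
.STACK 100H
.DATA
Buf     DB 21                   ; byte 0: capacity, including the final carriage return
        DB ?                    ; byte 1: DOS stores the number of characters read here
        DB 21 DUP(?)            ; bytes 2 onward: the characters, then a carriage return
NewLine DB 0DH, 0AH, '$'
.CODE
MAIN PROC
        MOV  AX, @DATA
        MOV  DS, AX
        LEA  DX, Buf
        MOV  AH, 0AH            ; buffered string input
        INT  21H
        XOR  BX, BX
        MOV  BL, Buf+1          ; BX = number of characters read
        MOV  Buf[BX+2], '$'     ; overwrite the carriage return with the '$' terminator
        LEA  DX, NewLine        ; move to a new line before echoing the text
        MOV  AH, 9
        INT  21H
        LEA  DX, Buf+2          ; the text starts at the third byte
        MOV  AH, 9
        INT  21H
        MOV  AX, 4C00H          ; exit to DOS
        INT  21H
MAIN ENDP
END MAIN
```

The length byte does not count the carriage return, so Buf+2+length is exactly the position of the carriage return that gets replaced.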
  Prompt the user to enter a lowercase letter, and on the next line display another message with the letter in uppercase.
  Enter a lower case letter: a
  In upper case it is: A

.MODEL SMALL
.STACK 100H
.DATA
CR   EQU 0DH
LF   EQU 0AH
MSG1 DB 'Enter a lower case letter: $'
MSG2 DB CR, LF, 'In upper case it is: '
CHAR DB ?, '$'
.CODE
.STARTUP                ; initialize data segment
    LEA DX, MSG1        ; display first message
    MOV AH, 9
    INT 21H
    MOV AH, 1           ; read character
    INT 21H
    SUB AL, 20H         ; convert it to upper case
    MOV CHAR, AL        ; and store it
    LEA DX, MSG2        ; display second message and
    MOV AH, 9           ; uppercase letter
    INT 21H
.EXIT                   ; return to DOS
END
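; Reverse a string in place using the stack
String DB "COE-205"

    MOV CX, 7               ; CX contains length of string
    XOR BX, BX              ; BX = index of the first character
Next:
    MOV AL, String[BX]      ; fetch the next character
    PUSH AX                 ; push it (only AL is meaningful)
    INC BX
    LOOP Next
    MOV CX, 7
    XOR BX, BX
Next2:
    POP AX                  ; characters come back in reverse order
    MOV String[BX], AL
    INC BX
    LOOP Next2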
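; Copy String1 to String2 with REP MOVSB
.DATA
String1 DB "Hello"
String2 DB 5 DUP(?)
.CODE
    MOV AX, @DATA
    MOV DS, AX          ; DS:SI addresses the source
    MOV ES, AX          ; ES:DI addresses the destination
    CLD                 ; DF = 0: SI and DI are incremented
    MOV CX, 5           ; number of bytes to copy
    LEA SI, String1
    LEA DI, String2
    REP MOVSB           ; repeat MOVSB CX times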
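; Copy String1 to String2 in reverse order
.DATA
String1 DB "Hello"
String2 DB 5 DUP(?)
.CODE
    MOV AX, @DATA
    MOV DS, AX
    MOV ES, AX
    STD                 ; DF = 1: MOVSB decrements SI and DI
    MOV CX, 5
    LEA SI, String1+4   ; start at the last character of String1
    LEA DI, String2
Next:
    MOVSB               ; copy one byte; SI and DI each decrease by 1
    ADD DI, 2           ; net effect: DI advances by 1
    LOOP Next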
Procedure
  Used to define subroutines; supports modular programming.
  A call to a procedure transfers control to the called procedure at run time.
  PROC: indicates the beginning of a procedure.
  The procedure type tells the assembler whether to code the return as near or far.
  The NEAR or FAR keyword that follows PROC indicates the type of procedure (NEAR by default).
  ENDP: indicates the end of the procedure to the assembler.
  Example:
ProcedureName PROC
    ; procedure code goes here
    RET
ProcedureName ENDP
  Procedure declaration
Name PROC type
    ; body of the procedure
    RET
Name ENDP
  Procedure type
  NEAR: the statement that calls the procedure is in the same segment as the procedure
  FAR: the statement that calls the procedure is in a different segment
  Default type is NEAR
  Procedure invocation
  CALL Name
  Executing a CALL instruction
  Saves the return address on the stack
  Near procedure: PUSH IP
  Far procedure: PUSH CS; PUSH IP
  Loads IP with the offset address of the first instruction of the procedure
  Loads CS with the new segment number if the procedure is far
  Executing a RET instruction
  Transfers control back to the calling procedure
  Near procedure: POP IP
  Far procedure: POP IP; POP CS
  RET n (near): also discards n bytes of parameters from the stack
  IP ← [SP+1:SP]
  SP ← SP + 2 + n
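The CALL and RET behaviour described above can be observed with a small near procedure that is called twice. This is a minimal sketch, assuming MASM syntax and the small model; PutChar is a hypothetical helper written for this example, not a DOS or library routine.

```asm
; Sketch: a near procedure called from MAIN; the program displays AB
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
        MOV  DL, 'A'
        CALL PutChar            ; pushes the return offset (IP), then jumps to PutChar
        MOV  DL, 'B'            ; execution resumes here after the first RET
        CALL PutChar
        MOV  AX, 4C00H          ; exit to DOS
        INT  21H
MAIN ENDP

PutChar PROC NEAR               ; expects the character to display in DL
        MOV  AH, 2              ; DOS single-character output
        INT  21H
        RET                     ; pops the saved offset back into IP
PutChar ENDP
END MAIN
```

Because both procedures live in the same code segment, only IP is saved and restored; a FAR procedure would save and restore CS as well.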

MACRO definition directive
  Used to define macros.
  A call to a macro is replaced by its body at assembly time.
  MACRO: informs the assembler of the beginning of a macro. A macro is an open subroutine: it is expanded in line wherever it is invoked.
  MacroName MACRO [arg1, arg2, … argn]
  Advantage: saves a great amount of effort and time by avoiding the overhead of writing a repeated pattern of code.
  ENDM: informs the assembler of the end of the macro.
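The same character-output task can be written as a macro instead of a procedure, which makes the contrast between the two easy to see. The sketch below assumes MASM syntax and the small model; PutCh and its parameter ch are hypothetical names chosen for this example.

```asm
; Sketch: a one-argument macro expanded twice; the program displays OK
.MODEL SMALL
.STACK 100H

PutCh   MACRO ch                ; ch is replaced by the argument text at each use
        MOV  DL, ch
        MOV  AH, 2              ; DOS single-character output
        INT  21H
        ENDM

.CODE
MAIN PROC
        PutCh 'O'               ; expands in line to the three instructions above
        PutCh 'K'               ; a second, separate copy of the same instructions
        MOV  AX, 4C00H          ; exit to DOS
        INT  21H
MAIN ENDP
END MAIN
```

Each use of the macro adds its three instructions to the program, so there is no CALL/RET overhead at run time, but the code grows with every expansion.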