EEE-3103: Microprocessor and Interfacing
Lecture #10: Introduction to Assembly Language
Dr. Sharnali Islam
Department of Electrical and Electronic Engineering
University of Dhaka
Web resource for the slides: Introduction to Assembly Language Programming
A Hierarchy of Languages
Assembly Language and Machine Language
➢ Machine language
• Native to a processor: executed directly by hardware
• Instructions consist of binary code: 1s and 0s
➢ Assembly language
• Slightly higher-level language
• Readability of instructions is better than machine language
• One-to-one correspondence with machine language instructions
➢ Assemblers translate assembly to machine code
➢ Compilers translate high-level programs to machine code
• Either directly, or
• Indirectly via an assembler
Compiler and Assembler
Translating Languages
English: D is assigned the sum of A times B plus 10.
High-Level Language: D = A * B + 10
A statement in a high-level language is typically translated into
several machine-level instructions.
Intel Assembly Language:    Intel Machine Language:
MOV EAX, A                  A1 00404000
MUL B                       F7 25 00404004
ADD EAX, 10                 83 C0 0A
MOV D, EAX                  A3 00404008
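The four-instruction sequence above can be traced with a short Python sketch. It is only a model of the arithmetic, not of a real processor: the function name `trace_sequence`, the `memory` dictionary, and the `bits` parameter (accumulator width) are assumptions made for illustration, and the high half of the product that MUL leaves in a second register is simply discarded.

```python
def trace_sequence(memory, bits=16):
    """Model MOV acc,A / MUL B / ADD acc,10 / MOV D,acc on a dict of variables."""
    mask = (1 << bits) - 1
    acc = memory["A"] & mask               # MOV acc, A
    product = acc * (memory["B"] & mask)   # MUL B: full double-width product
    acc = product & mask                   # only the low half stays in the accumulator
    acc = (acc + 10) & mask                # ADD acc, 10
    memory["D"] = acc                      # MOV D, acc
    return memory


if __name__ == "__main__":
    print(trace_sequence({"A": 6, "B": 7, "D": 0})["D"])   # 6 * 7 + 10 = 52
```

With small inputs the result matches D = A * B + 10 exactly; with large inputs the low half wraps around, which is one reason an assembly programmer has to think about register width.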
Why Learn Assembly Languages
➢ Two main reasons:
• Accessibility to system hardware
• Space and time efficiency
➢ Accessibility to system hardware
• Assembly Language is useful for implementing system software
• Also useful for small embedded system applications
➢ Space and Time efficiency
• Understanding sources of program inefficiency
• Tuning program performance
• Writing compact code
Programmer’s View of Computer System
Level 5: Application Programs (High-Level Language)
Level 4: Assembly Language
Level 3: Operating System
Level 2: Instruction Set Architecture
Level 1: Microarchitecture
Level 0: Digital Logic
The level of abstraction increases from Level 0 up to Level 5.
Each level hides the details of the level below it.
Programmer’s View of Computer System
❖ Application Programs (Level 5)
• Written in high-level programming languages
• Such as Java, C++, Pascal, Visual Basic . . .
• Programs compile into assembly language level (Level 4)
❖ Assembly Language (Level 4)
• Instruction mnemonics are used
• Instructions have a one-to-one correspondence with machine language
• Programs call functions written at the operating system level (Level 3)
• Programs are translated into machine language (Level 2)
❖ Operating System (Level 3)
• Provides services to level 4 and 5 programs
• Translated to run at the machine instruction level (Level 2)
Programmer’s View of Computer System
❖ Instruction Set Architecture (Level 2)
• Specifies how a processor functions
• Machine instructions, registers, and memory are exposed
• Machine language is executed by Level 1 (microarchitecture)
❖ Microarchitecture (Level 1)
• Controls the execution of machine instructions (Level 2)
• Implemented by digital logic (Level 0)
❖ Digital Logic (Level 0)
• Implements the microarchitecture
• Uses digital logic gates
• Logic gates are implemented using transistors
Constants
➢ Integer Constants
• Examples: -10, 42d, 10001101b, 0FF3Ah, 777o
• Radix: b = binary, d = decimal, h = hexadecimal, and o = octal
• If no radix is given, the integer constant is decimal
• A hexadecimal constant beginning with a letter must have a leading 0 (0FF3Ah, not FF3Ah)
➢ Character and String Constants
• Enclose character or string in single or double quotes
• Examples: 'A', "d", 'ABC', "ABC", '4096'
• Embedded quotes: "single quote ' inside", 'double quote " inside'
• Each ASCII character occupies a single byte
• Example: the string constant "349" is 3 bytes long.
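The radix rules above can be checked with a small Python sketch. The function name `parse_int_constant` and the `RADIX` table are hypothetical, and the sketch covers only the suffixes listed on this slide; a real assembler accepts more forms.

```python
RADIX = {"b": 2, "o": 8, "d": 10, "h": 16}


def parse_int_constant(text):
    """Convert an assembler-style integer constant such as '0FF3Ah' to an int."""
    s = text.strip().lower()
    sign = -1 if s.startswith("-") else 1
    s = s.lstrip("+-")
    if s and s[-1] in RADIX:
        base, digits = RADIX[s[-1]], s[:-1]   # explicit radix suffix
    else:
        base, digits = 10, s                  # no suffix: decimal
    if not digits or not digits[0].isdigit():
        # e.g. 'FF3Ah' would look like a name, so a leading 0 is required
        raise ValueError(f"constant must start with a digit: {text!r}")
    return sign * int(digits, base)


if __name__ == "__main__":
    for sample in ["-10", "42d", "10001101b", "0FF3Ah", "777o"]:
        print(sample, "=", parse_int_constant(sample))
    # Each ASCII character occupies one byte.
    print(len("349".encode("ascii")), "bytes")
```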
Assembly Language Statements
❖ Three types of statements in assembly language
• Typically, one statement should appear on a line
1. Executable Instructions
• Generate machine code for the processor to execute at runtime
• Instructions tell the processor what to do
2. Assembler Directives
• Provide information to the assembler while translating a program
• Used to define data, select memory model, etc.
• Non-executable: directives are not part of instruction set
3. Macros
• Shorthand notation for a group of statements
• Sequence of instructions, directives, or other macros
Instructions
• Assembly language instructions have the format:
[label:] mnemonic [operands] [;comment]
• Instruction Label (optional)
▪ Marks the address of an instruction, must have a colon :
▪ Used to transfer program execution to a labeled instruction
• Mnemonic
▪ Identifies the operation (e.g. MOV, ADD, SUB, JMP, CALL)
• Operands
▪ Specify the data required by the operation
▪ Executable instructions can have zero to three operands
▪ Operands can be registers, memory variables, or constants
Instruction Examples
✓ No operands
STC ; set carry flag
✓ One operand
INC AX ; increment register AX
JMP L1 ; jump to instruction with label L1
✓ Two operands
ADD BX, CX ; register BX = BX + CX
SUB var1, 25 ; memory variable var1 = var1 - 25
✓ Three operands
IMUL AX, BX, 5 ; register AX = BX * 5
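To make the format `[label:] mnemonic [operands] [;comment]` concrete, the following Python sketch splits a statement into its four fields. The function name `parse_statement` is hypothetical, and the sketch is deliberately simplified: it assumes no semicolons or commas appear inside string operands.

```python
def parse_statement(line):
    """Split '[label:] mnemonic [operands] [;comment]' into its four fields."""
    code, _, comment = line.partition(";")
    head = code.strip()
    label = None
    first, colon, rest = head.partition(":")
    # Treat the text before ':' as a label only if it is a single word.
    if colon and first.strip() and " " not in first.strip():
        label, head = first.strip(), rest.strip()
    parts = head.split(None, 1)
    mnemonic = parts[0].upper() if parts else None
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    return {
        "label": label,
        "mnemonic": mnemonic,
        "operands": operands,
        "comment": comment.strip(),
    }


if __name__ == "__main__":
    print(parse_statement("STC ; set carry flag"))
    print(parse_statement("L1: ADD BX, CX ; register BX = BX + CX"))
    print(parse_statement("IMUL AX, BX,5 ; register AX = BX * 5"))
```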
Directives
A directive is a statement that affects either the program listing or the way machine code is generated.
TITLE Flat Memory Program Template ([Link])
; Program Description:
; Author: Creation Date:
; Modified by: Modification Date:
.DATA
; (insert variables here)
.CODE
main PROC
; (insert executable instructions here)
exit
main ENDP
; (insert additional procedures here)
END main
Directives
❖ TITLE line (optional)
• Contains a brief heading of the program and the disk file name
❖ .DATA directive
• Defines an area in memory for the program data
• The program’s variables should be defined under this directive
• Assembler will allocate and initialize the storage of variables
❖ .CODE directive
• Defines the code section of a program containing instructions
• Assembler will place the instructions in the code area in memory
Directives
❖ PROC and ENDP directives
• Used to define procedures
• As a convention, we will define main as the first procedure
• Additional procedures can be defined after main
❖ END directive
• Marks the end of a program
• Identifies the name (main) of the program’s startup procedure
Procedures
➢ A procedure is a set of program statements that can be processed independently.
Variables and labels defined in a procedure are local to it.
General Form:
ProcedureName PROC [NEAR/FAR]
; save all registers that get modified using PUSH
…
; procedure code
…
; restore all registers that were saved on the stack using POP
RET ; return to the calling program
ProcedureName ENDP
PROC → directive that indicates beginning of procedure.
ENDP → directive that indicates end of procedure.
Procedures must be defined within code segment only.
Procedures
Procedure Call:
CALL → transfers control to a subprogram or procedure and saves the return address on the stack.
Two types → intra-segment (near) call and inter-segment (far) call.
General Form:
CALL ProcedureName
Procedure Call and Return
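MAIN PROC
    CALL PROC1          ; control transfers to PROC1
    next instruction    ; execution resumes here after RET
PROC1 PROC
    first instruction
    RET                 ; return to the instruction after CALL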
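The call and return flow above can be modelled with a list used as a stack. This Python sketch is an illustration only: the function name `run_calls`, the tuple-based program format, and the `HALT` and `NOP` pseudo-instructions are assumptions, and addresses are list indices rather than real memory addresses.

```python
def run_calls(program, entry=0):
    """Run a tiny instruction list; CALL pushes a return address, RET pops it."""
    stack, trace, ip = [], [], entry
    while True:
        op, arg = program[ip]
        trace.append(ip)
        if op == "CALL":
            stack.append(ip + 1)   # save the address of the next instruction
            ip = arg               # transfer control to the procedure
        elif op == "RET":
            ip = stack.pop()       # resume at the saved return address
        elif op == "HALT":
            return trace
        else:                      # any other instruction just falls through
            ip += 1


program = [
    ("CALL", 3),      # 0: MAIN:  CALL PROC1
    ("NOP", None),    # 1:        next instruction
    ("HALT", None),   # 2:        end of MAIN (hypothetical)
    ("NOP", None),    # 3: PROC1: first instruction
    ("RET", None),    # 4:        RET
]

if __name__ == "__main__":
    print(run_calls(program))   # order in which the instructions execute
```

The trace visits index 0, then 3 and 4 inside the procedure, and then continues at index 1, the instruction that follows the CALL.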
Procedures
Procedure Examples:
a) Near procedure:
HEX2ASC PROC NEAR
…
; procedure code
…
RET
HEX2ASC ENDP
Call to procedure:
CALL HEX2ASC
b) Far procedure:
IFACT PROC FAR
…
; procedure code
…
RET
IFACT ENDP
Call to procedure:
CALL FAR PTR IFACT
Macros
General Form:
MacroName MACRO [Argument1, …, ArgumentN]
…
; Body of macro: Program text to be declared as macro
…
ENDM
MACRO = directive that marks the beginning of a macro.
ENDM = directive that marks the end of a macro.
Body = definitions, declarations, or a short sequence of instructions.
The assembler substitutes the body in place of each macro invocation during assembly.
Invoking Macro:
MacroName Arguments
Macros
MACRO Example:
PrintString MACRO msg
mov ah, 09H ; AH=display string function
mov dx, offset msg ; DX=offset of a data item msg
int 21H ; call DOS service
ENDM
Invoking Macro:
msg1 db 'Hello everyone!$'
PrintString msg1
After assembly, the invocation expands to:
mov ah, 09H
mov dx, offset msg1
int 21H
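Macro expansion is essentially text substitution, which the following Python sketch imitates for the PrintString example. The function name `expand_macro` is hypothetical, and the sketch ignores features of real macro assemblers such as local labels, default arguments, and nested macros.

```python
import re


def expand_macro(params, body, args):
    """Return the body lines with each formal parameter replaced by its argument."""
    if len(params) != len(args):
        raise ValueError("argument count does not match parameter count")
    if not params:
        return list(body)
    mapping = dict(zip(params, args))
    # Match parameters as whole words only, so 'msg' does not match inside 'msg1'.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, params)) + r")\b")
    return [pattern.sub(lambda m: mapping[m.group(1)], line) for line in body]


print_string_body = [
    "mov ah, 09H",
    "mov dx, offset msg",
    "int 21H",
]

if __name__ == "__main__":
    for line in expand_macro(["msg"], print_string_body, ["msg1"]):
        print(line)
```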
How Assembly Language Works
Data Defining Statement
• Sets aside storage in memory for a variable
• May optionally assign a name (label) to the data
• Syntax:
[name] directive initializer [, initializer] . . .
val DB 10
• All initializers become binary data in memory
Name
✓ The name field is used for instruction labels, procedure names, and variable names.
✓ The assembler translates names into memory addresses.
✓ Names can be from 1 to 31 characters long and may consist of letters, digits, and the
special characters ?, ., @, _, $, %.
✓ Embedded blanks are not allowed.
✓ If a period is used, it must be the first character.
✓ Names may not begin with a digit.
✓ The assembler does not differentiate between uppercase and lowercase in a name.
Name
Examples of legal names:
• COUNTER1
• TOTAL@CLR
• SUM_OF_DIGITS
• $A000
• DONE?
• .TEST
Examples of illegal names:
1. TWO WORDS: contains a blank
2. 2abc: begins with a digit
3. A45.28: period is not the first character
4. YOU&ME: contains an illegal character
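The naming rules can be written as a single regular expression. The Python sketch below encodes only the rules stated on these slides (1 to 31 characters, the listed special characters, a period only in first position, no leading digit, case-insensitive comparison); the names `NAME_RE`, `is_legal_name`, and `same_name` are hypothetical, and reserved words are not checked.

```python
import re

# First character: letter, special character, or period (never a digit).
# Remaining characters: letters, digits, or special characters (no period).
NAME_RE = re.compile(r"[A-Za-z?@_$%.][A-Za-z0-9?@_$%]{0,30}")


def is_legal_name(name):
    """Return True if the name satisfies the rules listed on the slide."""
    return NAME_RE.fullmatch(name) is not None


def same_name(a, b):
    """The assembler does not distinguish uppercase from lowercase."""
    return a.upper() == b.upper()


if __name__ == "__main__":
    for name in ["COUNTER1", "TOTAL@CLR", ".TEST", "TWO WORDS", "2abc", "A45.28", "YOU&ME"]:
        print(name, "->", "legal" if is_legal_name(name) else "illegal")
```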
Data Types
Array
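An array is a sequence of variables of the same type stored in consecutive memory locations.
Example:
a DB 48h, 65h, 6Ch, 6Ch, 6Fh, 00h
These six bytes are the ASCII codes for "Hello" followed by a zero byte, stored at consecutive addresses starting at a.
You can access the value of any element of the array using square brackets, for example:
MOV AL, a[3] ; AL = 6Ch, the byte at offset 3 from a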
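The layout of the array `a` can be mirrored with a Python `bytes` object, where the index plays the role of the offset in square brackets. This is only an analogy for a byte array; for arrays of larger elements the assembler offset is counted in bytes, not in elements.

```python
# Same six initializers as the DB line for the array a.
a = bytes([0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x00])

if __name__ == "__main__":
    print(hex(a[3]))               # the byte that MOV AL, a[3] would load
    print(a[:5].decode("ascii"))   # the first five bytes read as ASCII text
```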
Array
✓ If you need to declare a large array, you can use the DUP operator.
The syntax for DUP:
number DUP ( value(s) )
number - number of duplicates to make (any constant value).
value - expression that DUP will duplicate.
Example:
1. c DB 5 DUP(9)
is an alternative way of declaring:
c DB 9, 9, 9, 9, 9
2. d DB 5 DUP(1, 2)
is an alternative way of declaring:
d DB 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
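The effect of DUP is the same as repeating a list. The following Python sketch uses a hypothetical helper named `dup` to reproduce the two examples above.

```python
def dup(count, *values):
    """Repeat the given value(s) count times, like 'count DUP(values)'."""
    return list(values) * count


if __name__ == "__main__":
    print(dup(5, 9))      # same bytes as c DB 9, 9, 9, 9, 9
    print(dup(5, 1, 2))   # same bytes as d DB 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
```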
Defining Strings
• A string is implemented as an array of characters
• For convenience, it is usually enclosed in quotation marks
• It is often terminated with a NULL char (byte value = 0)
• Examples:
str1 DB "Enter your name", 0
str2 DB 'Error: halting program', 0
str3 DB 'A','E','I','O','U'
greeting DB "Welcome to the Encryption "
Defining Strings
• To continue a single string across multiple lines, end each line with a comma
menu DB "Checking Account",0dh,0ah,0dh,0ah,
     "1. Create a new account",0dh,0ah,
     "2. Open an existing account",0dh,0ah,
     "3. Credit the account",0dh,0ah,
     "4. Debit the account",0dh,0ah,
     "5. Exit",0ah,0ah,
     "Choice> ",0
❖ End-of-line character sequence:
0Dh = 13 = carriage return
0Ah = 10 = line feed
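A DB line that mixes quoted text with numeric bytes simply lays the bytes out one after another. The Python sketch below imitates that with a hypothetical helper `define_bytes`; it assumes ASCII text and byte-sized integers only.

```python
CR, LF = 0x0D, 0x0A   # carriage return, line feed


def define_bytes(*items):
    """Concatenate string and integer initializers in the order a DB line lists them."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            out += item.encode("ascii")   # one byte per ASCII character
        else:
            out.append(item)              # a single numeric byte such as 0 or 0Dh
    return bytes(out)


str1 = define_bytes("Enter your name", 0)
line = define_bytes("5. Exit", CR, LF)

if __name__ == "__main__":
    print(len(str1), str1)
    print(len(line), line)
```

The terminating 0 and the CR/LF pair each count toward the size of the data, so `str1` occupies one byte more than its visible text.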
LABEL Directive
• Assigns an alternate name and type to a memory location
• LABEL does not allocate any storage of its own
• A label acts as a place marker when a program needs to jump from one location to another; the example below uses code labels (a name followed by a colon)
• Format: Name LABEL Type
MOV AX, 5 ; set AX to 5.
MOV BX, 2 ; set BX to 2.
JMP calc ; go to 'calc'.
back: JMP stop ; go to 'stop'.
calc:
ADD AX, BX ; add BX to AX.
JMP back ; go 'back'.
stop:
EQU Directive
• Three Formats:
Name EQU Expression ; integer constant expression
Name EQU Symbol ; existing symbol name
Name EQU <text> ; any text may appear within < … >
SIZE EQU 10*10 ; Integer constant expression
PI EQU <3.1416> ; Real symbolic constant
• Note: No memory is allocated for EQU names