Assembly Language Basics Explained

Uploaded by

mersha abdisa

Assembly language fundamentals

Chapter four
Outline
 Introduction to Assembly Language
 Basic Elements of Assembly Language
 Assembly, Machine, and High-Level Languages
 Defining Data Types
 Assembly Directives
 Assembly Language Programming Tools
Introduction
Levels of Programming Languages
1) Machine Language
• Consists of individual instructions that the CPU executes one at a time
2) Assembly Language (Low-Level Language)
• Designed for a specific family of processors (each processor family has its own assembly language)
• Consists of symbolic instructions that correspond one-for-one to machine language instructions and are assembled into machine language
3) High-Level Languages (HLL)
• e.g. C, C++, and Visual Basic
• Designed to eliminate the technicalities of a particular computer
• A single statement in a high-level language typically compiles into many low-level instructions
• HLL programs are machine independent
• They are easy to learn, easy to use, and convenient for managing complex tasks
Advantages of Assembly Language
1. Shows how a program interfaces with the processor, the operating system, and the BIOS (basic input/output system).
2. Shows how data is represented and stored in memory and on external devices.
3. Clarifies how the processor accesses and executes instructions and how instructions access and process data.
4. Clarifies how a program accesses external devices.
Reasons for using Assembly Language
1. A carefully written assembly language program can require considerably less memory and execution time than one written in a high-level language.
2. Assembly language is useful for implementing system software and for small embedded system applications.
3. Assembly language gives a programmer the ability to perform highly technical tasks that would be difficult, if not impossible, in a high-level language.
4. Although most software specialists develop new applications in high-level languages, which are easier to write and maintain, a common practice is to recode time-critical sections in assembly language.
5. Resident programs (that reside in memory while
Assembly vs HLL
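To make the comparison concrete, here is a minimal sketch of how one high-level statement expands into several assembly instructions. It assumes 32-bit MASM with the flat memory model and kernel32.lib supplying ExitProcess; the names valA, valB, valC, and total are illustrative and not taken from the slides.

```asm
; High-level statement being translated (C style):  total = valA + valB - valC;
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
valA  DWORD 40
valB  DWORD 10
valC  DWORD 8
total DWORD ?

.code
main PROC
    mov eax, valA           ; EAX = 40
    add eax, valB           ; EAX = 40 + 10 = 50
    sub eax, valC           ; EAX = 50 - 8 = 42
    mov total, eax          ; total = 42
    INVOKE ExitProcess, 0
main ENDP
END main
```

The single high-level statement becomes four instructions because each x86 instruction here performs only one operation.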
Basic Elements of Assembly Language
 Integer constants
 Integer expressions
 Character and string constants
 Reserved words and identifiers
 Directives and instructions
 Labels
 Mnemonics and Operands
 Comments
 Examples

Integer Constants
• Optional leading + or - sign
• Binary, decimal, hexadecimal, or octal digits
• Common radix characters (written as a suffix):
• h - hexadecimal
• q/o - octal
• d - decimal
• b - binary
• r - encoded real
• Examples: 30d, 6Ah, 42, 1101b
• A hexadecimal constant that begins with a letter needs a leading zero: 0A5h
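The radix suffixes can be tried out in a short data segment. This is a sketch only, assuming 32-bit MASM with the flat memory model and kernel32.lib for ExitProcess; all variable names are made up for illustration.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
decVal BYTE 30d             ; decimal 30
hexVal BYTE 6Ah             ; hexadecimal 6A = decimal 106
defVal BYTE 42              ; no suffix: decimal by default
binVal BYTE 1101b           ; binary 1101 = decimal 13
octVal BYTE 17q             ; octal 17 = decimal 15
hexA5  BYTE 0A5h            ; leading 0 so the assembler does not read A5h as an identifier

.code
main PROC
    mov al, hexA5           ; AL = A5h
    INVOKE ExitProcess, 0
main ENDP
END main
```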
Integer Expressions
 Operators and precedence levels:

 Examples:

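The operator table and examples from this slide did not survive extraction. As a hedged stand-in, the sketch below uses operators that MASM is known to accept in integer expressions (parentheses, unary minus, *, /, MOD, +, -). It assumes 32-bit MASM with the flat memory model and kernel32.lib for ExitProcess, and the variable names are illustrative.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
quotient  DWORD  16 / 5                 ; integer division: 3
product   SDWORD -(3 + 4) * (6 - 1)     ; parentheses are evaluated first: -35
mixed     SDWORD -3 + 4 * 6 - 1         ; * binds tighter than + and -: 20
remainder DWORD  25 MOD 3               ; modulus: 1

.code
main PROC
    mov eax, mixed                      ; EAX = 20
    INVOKE ExitProcess, 0
main ENDP
END main
```

The assembler evaluates these expressions at assembly time, so only the resulting constants are stored in the program.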
Character and String Constants
 Enclose character in single or double quotes
 'A', "x"
 ASCII character = 1 byte
 Enclose strings in single or double quotes
 "ABC"
 'xyz'
 Each character occupies a single byte
 Embedded quotes:
 'Say "Goodnight," Gracie'
Reserved Words and Identifiers
• Reserved words cannot be used as identifiers
• Instruction mnemonics (such as MOV, ADD, and MUL), directives, type attributes, operators, predefined symbols
• Identifiers
• 1-247 characters, including digits
• case insensitive (by default)
• first character must be a letter, _, @, ?, or $
• Examples: var1, Count, $first, _main, MAX, open_file, xVal
Directives
• Commands that are recognized and acted upon by the assembler
• Not part of the Intel instruction set
• Used to declare code and data areas, select the memory model, declare procedures, etc.
• E.g. myVar DWORD 26 ; DWORD directive
•      mov eax, myVar ; MOV instruction (EAX matches the 32-bit size of myVar)
• Different assemblers have different directives
• NASM's directives differ from MASM's, for example
Directives
• In MASM, directives are case insensitive.
• Different types of directives:
• Defining segments: one important function of assembler directives is to define program sections, or segments.
• The .DATA directive identifies the area of a program containing variables:
  .data
• The .CODE directive identifies the area of a program containing instructions:
  .code
• The .STACK directive identifies the area of a program holding the runtime stack, and sets its size:
  .stack 1000h
Directives
• PROC: identifies the beginning of a procedure.
• ENDP: marks the end of the procedure.
• END: marks the last line of the program to be assembled. It also identifies the name of the program's startup procedure (the procedure that starts program execution). Procedure main is the startup procedure.
• TITLE: marks the entire line as a comment.
• .MODEL: selects the memory model. With the flat model, the assembler generates code for a protected mode program, and STDCALL enables the calling of MS-Windows functions.
• Memory models include flat and small.
Instructions
• An instruction is a statement that becomes executable when a program is assembled.
• Instructions are translated by the assembler into machine language bytes, which are loaded and executed by the CPU at run time.
• We use the Intel IA-32 instruction set.
• The two major fields are:
• Opcode field, which stands for operation code and specifies the particular operation that is to be performed. Each operation has its unique opcode.
• Operand fields, which specify where to get the source and destination operands for the operation specified by the opcode.
• The source or destination of an operand can be a constant, memory, or one of the general-purpose registers.
• Syntax:
  [label] mnemonic(opcode) operand(s) [;comment]
• An instruction contains:
• Label: optional
• Mnemonic: required, such as MOV, ADD, SUB, MUL
• Operand: usually required
• Comment: optional
Labels
• Act as place markers
• mark the address (offset) of code and data
• Follow identifier rules
• Data label
• must be unique
• example: count DWORD 100 (not followed by colon)
• Code label
• target of jump and loop instructions
• example: L1: (followed by colon)
  L1: MOV ax, bx
      …
      JMP L1
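The two kinds of label can be seen side by side in a small counting loop. This is a sketch under assumptions: 32-bit MASM, flat memory model, kernel32.lib for ExitProcess. It uses a conditional jump (JNZ) rather than the unconditional JMP of the slide so that the loop terminates.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
count DWORD 100             ; data label: no colon

.code
main PROC
    mov ecx, count          ; ECX = 100, used as the loop counter
    mov eax, 0
L1:                         ; code label: followed by a colon
    inc eax                 ; executed once per iteration
    dec ecx
    jnz L1                  ; jump back to L1 until ECX reaches 0
                            ; here EAX = 100
    INVOKE ExitProcess, 0
main ENDP
END main
```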
Mnemonics and Operands
• Instruction mnemonics
• a mnemonic is a short "reminder" of what the instruction does
• examples: MOV, ADD, SUB, MUL, INC, DEC
• Operands
• constant (immediate value, e.g. 4)
• constant expression (e.g. 2 * 4)
• register (e.g. eax, ax)
• memory (data label)
Comments
• Comments are good!
• explain the program's purpose
• when it was written, and by whom
• revision information
• tricky coding techniques
• application-specific explanations
• Single-line comments
• begin with semicolon (;)
• Multi-line comments
• begin with COMMENT directive and a programmer-chosen character
• end with the same programmer-chosen character
• Example:
  COMMENT ! This is a comment.
  This line is also a comment. !
Instruction Format Examples
 No operands
 stc ; set Carry flag
 One operand
 inc eax ; register
 inc myByte ; memory
 Two operands
 add ebx,ecx ; register, register
 sub myByte,25 ; memory, constant
 add eax,36 * 25 ; register, constant-expression
Suggested Coding Standards
 Some approaches to capitalization
 capitalize nothing
 capitalize everything
 capitalize all reserved words, including instruction mnemonics and
register names
 capitalize only directives and operators
 Other suggestions
 descriptive identifier names
 spaces surrounding arithmetic operators
 blank lines between procedures
Suggested Coding Standards
 Indentation and spacing
 code and data labels – no indentation
 executable instructions – indent 4-5 spaces
 comments: begin at column 40-45, aligned vertically
 1-3 spaces between instruction and its operands

ex: mov ax,bx
 1-2 blank lines between procedures
Program Template

TITLE Program Template ([Link])

; Program Description:
; Author:
; Creation Date:
; Revisions:
; Date: Modified by:

INCLUDE [Link]

.data
    ; (insert variables here)

.code
main PROC
    ; (insert executable instructions here)
    exit                    ; exit to operating system
main ENDP
    ; (insert additional procedures here)
END main
Example: Adding and Subtracting Integers

TITLE Add and Subtract ([Link])

; This program adds and subtracts 32-bit integers.

INCLUDE [Link]

.code
main PROC
    mov eax,10000h          ; EAX = 10000h
    add eax,40000h          ; EAX = 50000h
    sub eax,20000h          ; EAX = 30000h
    call DumpRegs           ; display registers (EAX=00030000)
    exit
main ENDP
END main
Assembly Language Programming Tools
• Software tools are needed for editing, assembling, linking, and debugging assembly language programs
• An assembler is a program that converts source-code programs written in assembly language into object files in machine language
• Popular assemblers include:
• MASM (Microsoft Macro Assembler)
• TASM (Turbo Assembler from Borland)
• NASM (Netwide Assembler, for both Windows and Linux)
• GNU assembler, distributed by the Free Software Foundation
Linker and Link Libraries
 You need a linker program to produce executable files
 It combines your program's object file created by the assembler with
other object files and link libraries, and produces a single executable
program
 [Link] is the linker program provided with the MASM distribution
for linking 32-bit programs
 We will also use a link library for input and output
 Called [Link] developed by Kip Irvine
 Works in Win32 console mode under MS-Windows
Assemble and Link Process
[Figure: the assembler translates each source file into an object file; the linker combines the object files with link libraries into one executable file]
• A project may consist of multiple source files
• Assembler translates each source file separately into an object file
• Linker links all object files together with link libraries
Debugger
 Allows you to trace the execution of a program
 Allows you to view code, memory, registers, etc.
 Example: 32-bit Windows debugger
Editor
 Allows you to create assembly language source files
 Some editors provide syntax highlighting features and
can be customized as a programming environment
 Notepad, visual studio 2010 C++ express
Defining Data
 Intrinsic Data Types
 Data Definition Statement
 Defining BYTE and SBYTE Data
 Defining WORD and SWORD Data
 Defining DWORD and SDWORD Data
 Defining QWORD Data
 Defining TBYTE Data
 Defining Real Number Data
 Little Endian Order
 Declaring Uninitialized Data
Intrinsic Data Types
• BYTE, SBYTE: 8-bit unsigned integer; 8-bit signed integer
• WORD, SWORD: 16-bit unsigned and signed integer
• DWORD, SDWORD: 32-bit unsigned and signed integer
• QWORD: 64-bit integer
• TBYTE: 80-bit integer
• REAL4: 4-byte IEEE short real
• REAL8: 8-byte IEEE long real
• REAL10: 10-byte IEEE extended real
Data Definition Statement
• A data definition statement sets aside storage in memory for a variable.
• It may optionally assign a name (label) to the data.
• Syntax: [name] directive initializer [,initializer] ...
• Example: value1 BYTE 10
• All initializers become binary data in memory
Defining BYTE and SBYTE Data
• Defining byte arrays: use multiple initializers
• Example: list1 BYTE 10, 20, 30, 40
Defining Strings
• A string is implemented as an array of characters
• For convenience, it is usually enclosed in quotation marks
• It is often null-terminated (ends with a byte containing 0). Strings of this type are used in C and C++ programs.
• Examples:
str1 BYTE "Enter your name", 0
str2 BYTE 'Error: halting program', 0
str3 BYTE 'A','E','I','O','U'
greeting BYTE "Welcome to the Encryption program "
         BYTE "created by Kip Irvine.", 0
Using the DUP Operator
• Use DUP to allocate (create space for) an array or string.
• Syntax: counter DUP ( argument )
• Counter and argument must be constants or constant expressions
Defining WORD and SWORD Data
• Define storage for 16-bit integers, as a single value or multiple values
Defining DWORD and SDWORD Data
• Storage definitions for signed and unsigned 32-bit integers
Defining QWORD, TBYTE, and Real Number Data
• Storage definitions for quadwords, tenbyte values, and real numbers
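The examples for DUP and for the wider data types were lost from these slides. The sketch below shows one plausible set of definitions, assuming 32-bit MASM with the flat memory model and kernel32.lib for ExitProcess. Apart from list1 and str1, the names and values are illustrative.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
list1   BYTE   10, 20, 30, 40           ; byte array with four initializers
str1    BYTE   "Enter your name", 0     ; null-terminated string
zeros   BYTE   20 DUP(0)                ; 20 bytes, all zero
buffer  BYTE   20 DUP(?)                ; 20 bytes, uninitialized
pattern BYTE   4 DUP("STACK")           ; 20 bytes: "STACKSTACKSTACKSTACK"
word1   WORD   65535                    ; largest unsigned 16-bit value
sword1  SWORD  -32768                   ; smallest signed 16-bit value
dword1  DWORD  12345678h                ; unsigned 32-bit value
sdword1 SDWORD -2147483648              ; smallest signed 32-bit value
quad1   QWORD  1234567812345678h        ; 64-bit value
real1   REAL4  -2.1                     ; 4-byte short real
real2   REAL8  3.2E-260                 ; 8-byte long real

.code
main PROC
    mov al, list1                       ; AL = 10 (first array element)
    mov bx, word1                       ; BX = 65535
    mov ecx, dword1                     ; ECX = 12345678h
    INVOKE ExitProcess, 0
main ENDP
END main
```

Note that the register size in each MOV matches the size of the variable: 8 bits for BYTE, 16 for WORD, 32 for DWORD.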
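Little Endian Order
• All data types larger than a byte store their individual bytes in reverse order
• The least significant byte occurs at the first (lowest) memory address
• Example: val1 DWORD 12345678h is stored as the bytes 78h, 56h, 34h, 12h, lowest address first
Big Endian Order
• The most significant byte occurs at the first (lowest) memory address
• Example: val1 DWORD 12345678h would be stored as the bytes 12h, 34h, 56h, 78h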
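Little endian storage can be observed by reading val1 one byte at a time. This is a minimal sketch, assuming 32-bit MASM with the flat memory model and kernel32.lib for ExitProcess; stepping through it in a debugger shows the register values noted in the comments.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

.data
val1 DWORD 12345678h                ; bytes in memory: 78h 56h 34h 12h

.code
main PROC
    mov al, BYTE PTR val1           ; AL = 78h: lowest address, least significant byte
    mov bl, BYTE PTR [val1+1]       ; BL = 56h
    mov cl, BYTE PTR [val1+2]       ; CL = 34h
    mov dl, BYTE PTR [val1+3]       ; DL = 12h: highest address, most significant byte
    INVOKE ExitProcess, 0
main ENDP
END main
```

BYTE PTR is needed because val1 was declared as a DWORD and the destination registers are only 8 bits wide.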


Symbolic Constants
• Associate an identifier (a symbol) with an integer expression or some text
• Symbols do not reserve storage
• Used only by the assembler when scanning a program
• Cannot change at run time
Equal-Sign Directive
• Syntax: name = expression
• expression is a 32-bit integer (expression or constant)
• may be redefined
• name is called a symbolic constant
• good programming style to use symbols
• Example:
COUNT = 500
...
mov ax, COUNT ; 500 does not fit in the 8-bit AL register, so use AX
EQU Directive
• Define a symbol as either an integer or text expression.
• Cannot be redefined
• Syntax
name EQU expression ; integer expression
name EQU symbol ; existing symbol name
name EQU <text> ; any text
• Example
matrix EQU 10 * 10
PI EQU <3.1416>
pressKey EQU <"Press any key to continue...",0>
.data
prompt BYTE pressKey
MI WORD matrix
TEXTEQU Directive
• Define a symbol as either an integer or text expression.
• Called a text macro
• Can be redefined
• Example
continueMsg TEXTEQU <"Do you wish to continue (Y/N)?">
rowSize = 5
.data
prompt1 BYTE continueMsg
count TEXTEQU %(rowSize * 2) ; evaluates the expression
setupAL TEXTEQU <mov al,count>
.code
setupAL ; generates: "mov al,10"
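The three kinds of symbol can be combined in one complete program. This is a sketch only, assuming 32-bit MASM with the flat memory model and kernel32.lib for ExitProcess; ROW_BYTES and loadCount are hypothetical names chosen for illustration.

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD

COUNT = 5                           ; equal-sign symbol
COUNT = COUNT + 5                   ; allowed: a symbol defined with = may be redefined (now 10)
ROW_BYTES EQU 4 * 10                ; EQU symbol: 40, cannot be redefined
loadCount TEXTEQU <mov eax, COUNT>  ; text macro

.code
main PROC
    loadCount                       ; expands to: mov eax, COUNT, so EAX = 10
    mov ebx, ROW_BYTES              ; EBX = 40
    INVOKE ExitProcess, 0
main ENDP
END main
```

None of the three symbols occupies memory at run time; the assembler substitutes their values while scanning the source.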
Outline( lab)
 Tools and setups
 Notepad, notepad++, any other
 Assembler (MASM)
 Linker
 Assembling linking
 Step in execution
 Registers and memory
Tools
• DOSBox: download DOSBox 0.74 and install it.
• Notepad: we can use the Notepad editor; other editors, such as Visual Studio C++ Express, also work.
• Assembler (MASM): its task is to assemble programs written in assembly language. It generates a separate object file for each source file.
• Alternatively, you can download an 8086 MASM assembler package, which contains all the tools.
• Linker: links the object file with the link library.
• Debug: lets you step through a program and inspect registers and memory.
How to run?
• First write the code in Notepad, for simplicity.
• Open DOSBox.
• Mount the directory where your file is located by typing:
mount c c:\
or mount your folder directly:
mount c c:/foldername
• Then type c: to switch to the mounted drive. You are now in your directory.
• To assemble, type masm [Link] and press Enter at each prompt until C:\> is shown again.
• To link, type link filename.
• Finally, use the debug or afdebug command to execute the program:
C:\>debug [Link]
• Common DEBUG commands:
-t ; single-step (stepwise) execution
-g ; run the whole program at once
-p ; execute an interrupt or procedure call as one step during stepwise execution
-l ; reload the program to restart execution
-d ; display the data segment
-d ds:<starting address> <ending address> ; display the data in the given memory locations
-q ; quit the debugger
• Alternatively, type [Link] and press Enter, then type ? and press Enter to list the available commands.
Example
 You can write codes in the [Link] command
 E.g. addition of two numbers
 Push, pop and xchg
Decrement, subtraction and increment
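The screenshots for these instructions are missing, so here is a minimal sketch of push, pop, xchg, inc, dec, and sub in one program. It assumes 16-bit MASM under DOS (for example inside DOSBox) with the small memory model; the values are arbitrary.

```asm
.model small
.stack 100h

.code
main proc
    mov ax, 5
    mov bx, 9
    push ax                 ; top of stack = 5
    push bx                 ; top of stack = 9
    pop ax                  ; AX = 9 (last in, first out)
    pop bx                  ; BX = 5, so AX and BX have been swapped
    xchg ax, bx             ; swap back in one instruction: AX = 5, BX = 9
    inc ax                  ; AX = 6
    dec bx                  ; BX = 8
    sub bx, ax              ; BX = 8 - 6 = 2
    mov ax, 4C00h           ; AH = 4Ch (terminate program), AL = 0 (return code)
    int 21h
main endp
end main
```

Single-stepping with -t in DEBUG shows each register change listed in the comments.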
 First once you mount c:\>8086> [Link]
Other way
 E.g displaying the text “hello world”
Addition of two numbers
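The original listing for this slide was an image that did not survive. A minimal sketch of adding two numbers, assuming 16-bit MASM under DOS with the small memory model; num1, num2, and sum are illustrative names.

```asm
.model small
.stack 100h

.data
num1 dw 1234h
num2 dw 1111h
sum  dw ?

.code
main proc
    mov ax, @data           ; load the address of the data segment
    mov ds, ax              ; DS must point at .data before variables are used
    mov ax, num1            ; AX = 1234h
    add ax, num2            ; AX = 1234h + 1111h = 2345h
    mov sum, ax             ; store the result in memory
    mov ax, 4C00h           ; AH = 4Ch (terminate program), AL = 0 (return code)
    int 21h
main endp
end main
```

After the program runs under DEBUG, the -d command on the data segment shows the stored sum in little endian order (45h followed by 23h).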
Interrupt instructions
• MS-DOS uses INT 21H for its main API functions, which provide a low-level interface to devices: reading input from the keyboard, writing to the terminal, creating, reading, and writing files and directories, etc. MS-DOS uses other interrupts to provide other services.
• INT is an assembly language instruction for x86 processors that generates a software interrupt. For example, INT 21H generates software interrupt 0x21 (33 in decimal), causing the function pointed to by the 34th vector in the interrupt table to be executed, which is typically an MS-DOS API call.
• INT 03H: Breakpoint Interrupt. The INT 03H vector is used by debugging utilities to intercept execution when it reaches a user-selected address. The opcode for INT 03H is one byte (CCH), so it can be written over the start of any CPU instruction without any chance of overwriting the code that follows it.
• Example: a program can end with the two lines mov ah,4ch and int 21h. The first line stores the hexadecimal value 4C in register AH. The second line calls software interrupt 21h, which, when AH holds 4Ch (as is the case here), causes the program to exit immediately.
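The INT 21H functions mentioned here can be combined to display the text "hello world" and then exit. This is a sketch assuming 16-bit MASM under DOS with the small memory model; msg is an illustrative name, and function 09h is the standard DOS service for printing a string that ends with '$'.

```asm
.model small
.stack 100h

.data
msg db "hello world", 0Dh, 0Ah, "$"     ; function 09h prints up to the '$' terminator

.code
main proc
    mov ax, @data
    mov ds, ax              ; DS:DX must point at the string
    mov ah, 09h             ; DOS function 09h: display string
    mov dx, offset msg
    int 21h
    mov ah, 4Ch             ; DOS function 4Ch: terminate program
    int 21h
main endp
end main
```

The value placed in AH selects which DOS service INT 21H performs, which is why the same interrupt both prints the text and ends the program.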
