Memory Allocation
Adelaiye O.I
Objectives
Data allocation
Data types and sizes
Pointers to objects in memory
MOV instruction, copying data
Exchange Instructions
Sign-extending integers
Allocating Memory for Integers
Intel x86 CPUs perform operations on data of
several sizes.
An integer is a whole number with no fractional
part.
In assembly language, variables are created with data
allocation directives.
Declaring an integer variable assigns
a label to the memory space allocated for that
integer.
Allocating Memory Cont’d
The variable name becomes a label for the
memory space. For example,
MyVar db 77h ; byte-sized variable called MyVar, initialized to 77h
where
MyVar is the variable name
db is the directive for byte-sized memory allocation
77h is the initializer specifying the initial value.
Endian
Different processors store multi-byte integers in
different orders in memory.
There are two popular methods of storing
integers: big endian and little endian.
Big endian method is the most natural:
the biggest (i.e. most significant) byte is stored first,
then the next biggest, etc.
IBM mainframes, most RISC processors and
Motorola processors all use this big endian
method.
Endian Cont’d
However, Intel-based processors use the little endian
method, in which the least significant byte is stored first.
Normally, the programmer does not need to worry about
which format is used, unless:
Binary data is transferred between different computers,
e.g. over a network.
All TCP/IP headers store integers in big endian format
(called network byte order).
Binary data is written out to memory as a multibyte
integer and then read back as individual bytes, or vice
versa.
Endian Explained
Byte sequence order (lowest memory address first)
Data type   Value (hex)   Big endian    Little endian
WORD        1234          12 34         34 12
DWORD       47D5A8        00 47 D5 A8   A8 D5 47 00
DWORD       56789ABC      56 78 9A BC   BC 9A 78 56
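The rows of this table can be reproduced with a short Python sketch. Python is used here only as a calculator; the helper name memory_bytes is illustrative and not part of any assembler.

```python
def memory_bytes(value: int, size: int, order: str) -> str:
    """Return the bytes of `value` as stored in memory, lowest address first.

    `order` is "big" or "little"; `size` is the width in bytes (2 = WORD, 4 = DWORD).
    """
    return value.to_bytes(size, byteorder=order).hex(" ").upper()


# The three values from the table, with their sizes in bytes.
for size, value in [(2, 0x1234), (4, 0x47D5A8), (4, 0x56789ABC)]:
    big = memory_bytes(value, size, "big")
    little = memory_bytes(value, size, "little")
    print(f"{value:X}: big endian = {big}, little endian = {little}")
```

Note that the little endian column is simply the big endian column read backwards, one byte at a time; the bits inside each byte are not reversed.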
Abbreviated Data Allocation
Multiple definitions can be abbreviated.
For example,
message DB 'B'
DB 'y'
DB 'e'
DB 0DH
DB 0AH
can be written as
message DB 'B', 'y', 'e', 0DH, 0AH
and even more compactly as
message DB 'Bye', 0DH, 0AH
Multi-bytes
Writing out every initializer is cumbersome for
data structures such as arrays.
For example, to declare and initialize an integer array
of 8 elements:
values DW 0, 0, 0, 0, 0, 0, 0, 0
What if we want to declare and zero-initialize an
array with many more elements?
The assembler provides a better way of doing this with the DUP
operator:
values DW 8 DUP (0)
Symbol Table
For multiple data directives, the assembler builds a
symbol table.
Both the offset (in bytes) and the label refer to the
allocated storage space in memory:
.DATA
value DW 0
sum DD 0
marks DW 10 DUP (?)
message DB 'The grade is:',0
char1 DB ?
Symbol table:
Name      Offset
value     0
sum       2
marks     6
message   26
char1     40
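Each offset is the running total of the sizes of the declarations before it. The following Python sketch recomputes the offsets for the five declarations above; the function name build_symbol_table and the tuple layout are assumptions made for illustration, not how any particular assembler is implemented.

```python
# Bytes reserved per element by each directive used in the example.
DIRECTIVE_SIZE = {"DB": 1, "DW": 2, "DD": 4}


def build_symbol_table(declarations):
    """Map each label to its byte offset.

    `declarations` is a list of (label, directive, element_count) tuples
    in source order.
    """
    table = {}
    offset = 0
    for label, directive, count in declarations:
        table[label] = offset                      # label marks the current offset
        offset += DIRECTIVE_SIZE[directive] * count  # advance past the storage
    return table


declarations = [
    ("value", "DW", 1),
    ("sum", "DD", 1),
    ("marks", "DW", 10),                           # 10 DUP (?)
    ("message", "DB", len("The grade is:") + 1),   # 13 characters plus the 0 byte
    ("char1", "DB", 1),
]

symbol_table = build_symbol_table(declarations)
for name, offset in symbol_table.items():
    print(f"{name:8} {offset}")
```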
Similarity with C Data Type
Directive   C data type
DB          char
DW          int, unsigned int
DD          float, long
DQ          double
DT          internal intermediate float value
(The sizes match a 16-bit C compiler, where int is 2 bytes.)
Keyword                            Description
BYTE, DB (byte)                    Allocates unsigned numbers from 0 to 255.
SBYTE (signed byte)                Allocates signed numbers from -128 to +127.
WORD, DW (word = 2 bytes)          Allocates unsigned numbers from 0 to 65,535 (64K).
SWORD (signed word)                Allocates signed numbers from -32,768 to +32,767.
DWORD, DD (doubleword = 4 bytes)   Allocates unsigned numbers from 0 to 4,294,967,295 (4 gigabytes).
SDWORD (signed doubleword)         Allocates signed numbers from -2,147,483,648 to +2,147,483,647.
FWORD, DF (farword = 6 bytes)      Allocates 6-byte (48-bit) integers. These values are normally
                                   used only as pointer variables on the 80386/486 processors.
QWORD, DQ (quadword = 8 bytes)     Allocates 8-byte integers used with 8087-family coprocessor
                                   instructions.
TBYTE, DT (10 bytes)               Allocates 10-byte (80-bit) integers if the initializer has a
                                   radix specifying the base of the number.
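The signed and unsigned limits follow directly from the number of bits in each type. A minimal Python sketch, with hypothetical helper names, that derives them:

```python
def unsigned_range(bits: int):
    """Smallest and largest value of an unsigned integer with `bits` bits."""
    return 0, 2**bits - 1


def signed_range(bits: int):
    """Smallest and largest value of a two's complement integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, bits in [("BYTE", 8), ("WORD", 16), ("DWORD", 32)]:
    print(f"{name}: unsigned {unsigned_range(bits)}, signed {signed_range(bits)}")
```

The negative limit is always one larger in magnitude than the positive limit because zero takes one of the non-negative bit patterns.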
Data Storage in register
• The data types SBYTE, SWORD, and SDWORD tell
the assembler to treat the initializers as signed
data.
Pointer Operator
The CPU has instructions to copy, move, and sign-extend integer values.
These instructions require operands to be the same size.
However, we may need to operate on data with a size other than that
originally declared.
The PTR operator forces an expression to be treated as the specified
type:
.DATA
num DWORD 0
.CODE
mov ax, WORD PTR num[0] ; Load a word-size value from
mov dx, WORD PTR num[2] ; a doubleword variable
The PTR operator re-casts the DWORD-sized memory location addressed by the
num[index] expression as a WORD-sized value.
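To see which halves end up in AX and DX, the two loads can be modelled in Python. This sketch assumes num holds 12345678h (the declaration above initializes it to 0) and a little endian machine; read_word is an illustrative helper, not an assembler feature.

```python
import struct

# Assumed example value; the slide's declaration initializes num to 0.
num = 0x12345678

# A DWORD in little endian memory: bytes 78 56 34 12, lowest address first.
memory = struct.pack("<I", num)


def read_word(mem: bytes, offset: int) -> int:
    """Model of `WORD PTR num[offset]`: read 2 bytes as a little endian word."""
    (word,) = struct.unpack_from("<H", mem, offset)
    return word


ax = read_word(memory, 0)  # mov ax, WORD PTR num[0]
dx = read_word(memory, 2)  # mov dx, WORD PTR num[2]
print(f"AX = {ax:04X}h, DX = {dx:04X}h")
```

Because the low-order bytes sit at the lower address, num[0] yields the low word and num[2] the high word, so DX:AX together hold the original doubleword.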
Copying Data Values
The primary instructions for moving data from operand to
operand and loading it into registers are
MOV (Move)
XCHG (Exchange)
CWD (Convert Word to Doubleword)
CBW (Convert Byte to Word).
MOV Instruction
MOV copies the source operand to the destination operand without
affecting the source.
; Immediate value moves
mov ax, 7 ; Immediate to register
mov mem, 7 ; Immediate to memory direct
mov mem[bx], 7 ; Immediate to memory indirect
; Register moves
mov mem, ax ; Register to memory direct
mov mem[bx], ax ; Register to memory indirect
mov ax, bx ; Register to register
mov ds, ax ; General register to segment register
; Direct memory moves
mov ax, mem ; Memory direct to register
mov ds, mem ; Memory to segment register
; Indirect memory moves
mov ax, mem[bx] ; Memory indirect to register
mov ds, mem[bx] ; Memory indirect to segment register
MOV Instruction Cont’d
; Segment register moves
mov mem, ds ; Segment register to memory
mov mem[bx], ds ; Segment register to memory indirect
mov ax, ds ; Segment register to general register
; Move immediate to segment register
mov ax, DGROUP ; Load AX with immediate value
mov ds, ax ; Copy AX to segment register
; Move memory to memory
mov ax, mem1 ; Load AX with memory value
mov mem2, ax ; Copy AX to other memory
; Move segment register to segment register
mov ax, ds ; Load AX with segment register
mov es, ax ; Copy AX to segment register
Exchange Instructions
The XCHG (exchange data) instruction exchanges
the contents of two operands.
There are three variants:
XCHG reg, reg
XCHG reg, mem
XCHG mem, reg
Exchange Instructions Cont’d
You can exchange data between registers or between
registers and memory, but not from memory to memory:
xchg ax, bx ; Put AX in BX and BX in AX
xchg memory, ax ; Put "memory" in AX and AX in "memory"
xchg mem1, mem2 ; Illegal, can't exchange memory locations!
The rules for operands in the XCHG instruction are otherwise the same
as those for the MOV instruction.
NB: XCHG does not accept immediate operands or segment registers.
Exchange Instructions Cont’d
In array sorting applications, XCHG provides a simple way to
exchange two array elements.
A few more examples using XCHG:
xchg ax, bx ; exchange 16-bit regs
xchg ah, al ; exchange 8-bit regs
xchg eax, ebx ; exchange 32-bit regs
xchg [response], cl ; exchange 8-bit mem op with CL
xchg [total], edx ; exchange 32-bit mem op with EDX
Without the XCHG instruction, exchanging two values using only
MOV requires a temporary register.
Memory to Memory Exchange
To exchange two memory operands, use a register as a
temporary container and combine MOV with XCHG. For
example,
.DATA
val1 WORD 1000h
val2 WORD 2000h
.CODE
mov ax, [val1] ; AX = 1000h
xchg ax, [val2] ; AX = 2000h, val2 = 1000h
mov [val1], ax ; val1 = 2000h
Byte Swap
The XCHG instruction is useful for converting 16-bit
data between little endian and big endian forms.
For example, the following XCHG converts the data in
AX into the other endian form:
xchg al, ah
The 80486 and later processors (including the Pentium)
provide the BSWAP instruction to do a similar
conversion on 32-bit data:
BSWAP 32-bit register
NB: BSWAP works only on data located in a 32-bit
register.
Byte Swap Cont’d
BSWAP reverses the byte order of its operand. For example,
bswap eax
• Result: the four bytes of EAX are reversed, e.g. 12345678h becomes 78563412h.
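Both byte swaps can be modelled on plain integers. The Python sketch below is only a model of the register contents; the function names xchg_al_ah and bswap are chosen here to mirror the instructions.

```python
def xchg_al_ah(ax: int) -> int:
    """Model of `xchg al, ah`: swap the two bytes of a 16-bit value."""
    al = ax & 0xFF
    ah = (ax >> 8) & 0xFF
    return (al << 8) | ah


def bswap(eax: int) -> int:
    """Model of `bswap eax`: reverse the four bytes of a 32-bit value."""
    return int.from_bytes((eax & 0xFFFFFFFF).to_bytes(4, "little"), "big")


print(f"{xchg_al_ah(0x1234):04X}h")   # 3412h
print(f"{bswap(0x12345678):08X}h")    # 78563412h
```

Applying either function twice returns the original value, which is why the same instruction converts in both directions between the two endian forms.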
Signed and Unsigned Integers
Since moving data between registers of different
sizes is illegal, you must sign-extend integers to
convert signed data to a larger size.
Sign-extending means copying the sign bit of the
unextended operand to all bits of the operand's
next larger size.
This widens the operand while maintaining its sign
and value.
The four instructions presented below act only on
the accumulator register (AL, AX, or EAX).
Signed and Unsigned Integers Cont’d
Instruction                                  Sign-extend
CBW (convert byte to word)                   AL to AX
CWD (convert word to doubleword)             AX to DX:AX
CWDE (convert word to doubleword extended)   AX to EAX
CDQ (convert doubleword to quadword)         EAX to EDX:EAX
Signed Values
.DATA
mem8 SBYTE -5
mem16 SWORD +5
mem32 SDWORD -5
.CODE
.
.
.
mov al, mem8 ; Load 8-bit -5 (FBh)
cbw ; Convert to 16-bit -5 (FFFBh) in AX
mov ax, mem16 ; Load 16-bit +5
cwd ; Convert to 32-bit +5 (0000:0005h) in DX:AX
mov ax, mem16 ; Load 16-bit +5
cwde ; Convert to 32-bit +5 (00000005h) in EAX
mov eax, mem32 ; Load 32-bit -5 (FFFFFFFBh)
cdq ; Convert to 64-bit -5 (FFFFFFFF:FFFFFFFBh) in EDX:EAX
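The values in the comments can be checked with a small Python model of sign extension. The functions below are illustrative stand-ins for the four instructions, operating on plain integers rather than real registers.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Copy the sign bit of a `from_bits`-wide value into all new upper bits."""
    mask_from = (1 << from_bits) - 1
    mask_to = (1 << to_bits) - 1
    value &= mask_from
    if value >> (from_bits - 1):           # sign bit is 1
        value |= mask_to & ~mask_from      # fill the added upper bits with 1s
    return value


def cbw(al: int) -> int:
    """AL -> AX."""
    return sign_extend(al, 8, 16)


def cwd(ax: int):
    """AX -> (DX, AX)."""
    wide = sign_extend(ax, 16, 32)
    return wide >> 16, wide & 0xFFFF


def cwde(ax: int) -> int:
    """AX -> EAX."""
    return sign_extend(ax, 16, 32)


def cdq(eax: int):
    """EAX -> (EDX, EAX)."""
    wide = sign_extend(eax, 32, 64)
    return wide >> 32, wide & 0xFFFFFFFF


print(f"cbw(FBh)        = {cbw(0xFB):04X}h")
print(f"cwde(0005h)     = {cwde(0x0005):08X}h")
dx, ax = cwd(0x0005)
print(f"cwd(0005h)      = {dx:04X}:{ax:04X}h")
edx, eax = cdq(0xFFFFFFFB)
print(f"cdq(FFFFFFFBh)  = {edx:08X}:{eax:08X}h")
```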
Signed Values Cont’d
Sign-extending instructions also convert unsigned values correctly,
provided the sign bit is zero.
The previous example, for instance, correctly widens mem16 whether you treat
the variable as signed or unsigned.
The processor does not differentiate between signed and unsigned
values.
For instance, the value of mem8 in the previous example is literally 251
(0FBh) to the processor.
It ignores the human convention of treating the highest bit as an
indicator of sign.
The processor can ignore the distinction between signed and unsigned
numbers because binary arithmetic works the same in either case.
The programmer, not the processor, must keep track of which values
are signed or unsigned, and treat them accordingly.
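A short Python sketch of this point: the same 8-bit pattern FBh reads as 251 or as -5 depending only on the interpretation, and an 8-bit addition produces the same bits either way. The helper names are hypothetical.

```python
def as_unsigned(byte: int) -> int:
    """Interpret an 8-bit pattern as an unsigned number (0 to 255)."""
    return byte & 0xFF


def as_signed(byte: int) -> int:
    """Interpret an 8-bit pattern as a two's complement number (-128 to +127)."""
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def add8(a: int, b: int) -> int:
    """8-bit addition as the processor performs it: keep only the low 8 bits."""
    return (a + b) & 0xFF


pattern = 0xFB
print(as_unsigned(pattern), as_signed(pattern))   # 251 -5

# One addition, two readings of the same result bits:
result = add8(0xFB, 0x0A)
print(f"{result:02X}h")   # 05h: 251 + 10 = 261 wraps to 5, and -5 + 10 = 5
```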
Unsigned Values
If sign extension is not what you want, that is, if you
need to extend an unsigned value, explicitly set the upper
register (or the upper half of the register) to zero:
.DATA
mem8 BYTE 251
mem16 WORD 251
.CODE
.
.
.
mov al, mem8 ; Load 251 (FBh) from 8-bit memory
sub ah, ah ; Zero upper half (AH)
mov ax, mem16 ; Load 251 (FBh) from 16-bit memory
sub dx, dx ; Zero upper half (DX)
sub eax, eax ; Zero entire extended register (EAX)
mov ax, mem16 ; Load 251 (FBh) from 16-bit memory
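The difference between the two kinds of extension shows up as soon as the top bit is set. The Python sketch below models the 8-bit case from the listing above next to what CBW would have produced; the function names and the starting value ABCDh for AX are assumptions for illustration.

```python
def zero_extend_byte(ax_before: int, mem8: int) -> int:
    """Model of `mov al, mem8` followed by `sub ah, ah`."""
    al = mem8 & 0xFF                  # mov al, mem8
    ah = (ax_before >> 8) & 0xFF      # AH still holds its old contents
    ah = (ah - ah) & 0xFF             # sub ah, ah  -> always 0
    return (ah << 8) | al


def sign_extend_byte(mem8: int) -> int:
    """Model of `mov al, mem8` followed by `cbw`."""
    al = mem8 & 0xFF                  # mov al, mem8
    ah = 0xFF if al & 0x80 else 0x00  # cbw copies AL's top bit into all of AH
    return (ah << 8) | al


print(zero_extend_byte(0xABCD, 251))  # 251   (00FBh): the unsigned value is preserved
print(sign_extend_byte(251))          # 65531 (FFFBh): wrong if 251 was meant as unsigned
```

For values below 128 the two functions agree, which is why sign extension is safe for unsigned data only when the sign bit is zero.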