Introduction To Floating Point Representation: CS453 Computer System Design

The document provides an overview of floating-point representation in computer systems, detailing the differences between integers, fixed-point, and floating-point numbers. It explains the IEEE 754 standard for single precision floating-point numbers, including the formats for representation, conversion methods, and special cases like positive and negative zero and infinity. Additionally, it covers operations such as addition, subtraction, and multiplication of floating-point numbers.


CS453 Computer System Design

Introduction to Floating Point Representation

John Jose
Associate Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati
Integers and fixed-point numbers
• Integers: the universe is infinite but discrete
• No numbers between consecutive integers, e.g., 5 and 6
• A countable (finite) number of items in a finite range
• Referred to as fixed-point numbers
Real Numbers and floating-point numbers
• Real numbers – the universe is infinite and continuous
• Fractions represented by decimal notation
• Rational numbers, e.g., 5/2 = 2.5
• Irrational numbers, e.g., π = 3.14159265 . . . (22/7 is only a rational approximation)
• Infinitely many numbers exist even in the smallest range
• Referred to as floating-point numbers
• A large number: 976,000,000,000,000 = 9.76 × 10^14
• A small number: 0.0000000000000976 = 9.76 × 10^-14
Standard Scientific Notation
• Decimal numbers
• 0.513×10^5, 5.13×10^4 and 51.3×10^3 are written in scientific notation.
• 5.13×10^4 is the normalized scientific notation.
• Binary numbers
• Base 2
• Binary point – multiplication by 2 moves the point to the right.
• Normalized scientific notation, e.g., 1.0_two × 2^-1
Floating Point Numbers
• General format: ±1.bbbbb_two × 2^eeee
or (-1)^S × (1+F) × 2^E
• Where
• S = sign, 0 for positive, 1 for negative
• F = fraction (or mantissa), a binary fraction with 0 ≤ F < 1; 1+F is called the significand
• E = exponent as a binary integer, positive or negative (two’s complement)
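
As a minimal sketch of the general format, the Python snippet below splits a nonzero number into S, F and E such that the number equals (-1)^S × (1+F) × 2^E. The function name `normalize` is an illustrative assumption, not part of the lecture; it relies on the standard `math.frexp` call.

```python
import math


def normalize(x: float) -> tuple[int, float, int]:
    """Return (S, F, E) with x == (-1)**S * (1 + F) * 2**E, for nonzero finite x."""
    if x == 0.0 or not math.isfinite(x):
        raise ValueError("x must be nonzero and finite")
    sign = 1 if x < 0 else 0
    # frexp gives abs(x) == m * 2**e with 0.5 <= m < 1
    m, e = math.frexp(abs(x))
    # Rescale so that the significand 1 + F lies in [1, 2)
    return sign, 2.0 * m - 1.0, e - 1


print(normalize(0.5))      # (0, 0.0, -1)
print(normalize(-0.4375))  # (1, 0.75, -2)
```

The second call corresponds to -1.11_two × 2^-2, since F = 0.75 is 0.11 in binary.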
Numbers in 32-bit Formats
• Two’s complement integers
Expressible numbers: -2^31 to 2^31 - 1

• Floating point numbers
[Number line, left to right: –∞, negative overflow, expressible negative numbers, negative underflow, –0, 0, +0, positive underflow, expressible positive numbers, positive overflow, +∞]
IEEE 754 Floating Point Standards
IEEE 754 Floating Point Standard
• Single Precision Floating point numbers

• Biased exponent: true exponent range [-126, 127] is changed to [1, 254]
• Biased exponent is an 8-bit positive binary integer.
• True exponent obtained by subtracting 127_ten or 01111111_two
• First bit of significand is always 1
• ± 1.bbbb . . . b × 2^E
• 1 before the binary point is implicitly assumed.
• Significand field represents 23-bit fraction after the binary point.
IEEE 754 Floating Point Standard
• Single Precision Floating point numbers
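Field layout (normalized numbers): sign bit S (bit 31) | biased exponent (bits 23-30) | fraction F (bits 0-22)
Example bit pattern: 1 1011001 01001100000000010001101
True exponent E = biased exponent (a positive integer) – 127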

[Number line, left to right: –∞, negative overflow, expressible negative numbers, -2^-126, negative underflow, –0, 0, +0, positive underflow, 2^-126, expressible positive numbers, positive overflow, +∞]
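
The field layout described above can be sketched in Python with shifts and masks. The function name `split_fields` and the sample pattern 0x42AA4000 are illustrative assumptions chosen only to exercise the layout.

```python
def split_fields(bits: int) -> tuple[int, int, int]:
    """Split a 32-bit pattern into (sign, biased exponent, fraction)."""
    sign = (bits >> 31) & 0x1             # bit 31
    biased_exponent = (bits >> 23) & 0xFF  # bits 23-30
    fraction = bits & 0x7FFFFF            # bits 0-22
    return sign, biased_exponent, fraction


sign, biased, frac = split_fields(0x42AA4000)
# True exponent = biased exponent - 127
print(sign, biased, biased - 127, format(frac, "023b"))
# 0 133 6 01010100100000000000000
```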
Decimal Fraction to Binary (IEEE 754) conversion
• Represent 85.125 in IEEE 754 format

• 85 = 1010101_two; 0.125 = 0.001_two

• 85.125 = 1010101.001_two = 1.010101001_two × 2^6 [sign = 0]

• Biased exponent: 127 + 6 = 133 = 10000101_two

• Normalised mantissa = 010101001 (we will add 0's to complete the 23 bits)

• The IEEE 754 single precision representation is: 0 10000101 01010100100000000000000

• Hexadecimal form: 0100 0010 1010 1010 0100 0000 0000 0000 = 0x42AA4000
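
The result above can be cross-checked with Python's standard `struct` module, which packs a number into IEEE 754 single precision. The helper name `float_to_bits` is an assumption introduced for illustration.

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned int
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits


bits = float_to_bits(85.125)
print(f"{bits:08X}")   # 42AA4000
print(f"{bits:032b}")  # 01000010101010100100000000000000
```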
Binary to Decimal Fraction Conversion
Binary: (-1)^S × (1.b1b2b3b4)_two × 2^E

Decimal: (-1)^S × (1 + b1×2^-1 + b2×2^-2 + b3×2^-3 + b4×2^-4) × 2^E

Example: -1.1100_two × 2^-2

= - (1 + 2^-1 + 2^-2) × 2^-2

= - (1 + 0.5 + 0.25)/4

= - 1.75/4

= - 0.4375 (decimal)
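
The weighted-sum formula above can be sketched directly in Python. The function name `binary_fp_to_decimal` and its string-based interface are illustrative assumptions.

```python
def binary_fp_to_decimal(sign: int, fraction_bits: str, exponent: int) -> float:
    """Evaluate (-1)^S * (1.b1b2...)_two * 2^E from a string of fraction bits."""
    significand = 1.0  # the leading 1 before the binary point
    for i, bit in enumerate(fraction_bits, start=1):
        if bit not in ("0", "1"):
            raise ValueError("fraction_bits must contain only 0 and 1")
        significand += int(bit) * 2.0 ** -i  # bit b_i has weight 2^-i
    return (-1) ** sign * significand * 2.0 ** exponent


print(binary_fp_to_decimal(1, "1100", -2))  # -0.4375
```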
Conversion From Hex to Decimal
• R1 = 0x42220000
0 100 0010 0010 0010 0000 0000 0000 0000
→ E’ = 100 0010 0_two = 132 → E = 132 - 127 = 5
= + 1.0100010_two × 2^5 = 101000.10_two = +40.5
• R2 = 0xC12E0000
1 100 0001 0010 1110 0000 0000 0000 0000
→ E’ = 100 0001 0_two = 130 → E = 130 - 127 = 3
= - 1.0101110_two × 2^3 = - 1010.1110_two = - (10 + 0.5 + 0.25 + 0.125) = -10.875
• R3 = 0xC0800000
1 100 0000 1000 0000 0000 0000 0000 0000
→ E’ = 100 0000 1_two = 129 → E = 129 - 127 = 2
= - 1.00_two × 2^2 = - 100_two = -4
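
The three conversions above can be reproduced with a small decoder. This is a sketch for normalized numbers only (biased exponent 1 to 254); the function name `decode_normalized` is an assumption made for illustration.

```python
def decode_normalized(bits: int) -> float:
    """Decode a normalized IEEE 754 single-precision pattern by hand."""
    sign = (bits >> 31) & 0x1
    biased = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased in (0, 255):
        raise ValueError("zero, infinity and NaN patterns are not handled here")
    significand = 1 + fraction / 2 ** 23  # implicit leading 1 plus 23-bit fraction
    return (-1) ** sign * significand * 2.0 ** (biased - 127)


for name, pattern in [("R1", 0x42220000), ("R2", 0xC12E0000), ("R3", 0xC0800000)]:
    print(name, decode_normalized(pattern))
# R1 40.5
# R2 -10.875
# R3 -4.0
```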
Positive Zero in IEEE 754
0 00000000 00000000000000000000000
(sign | biased exponent | fraction)

• Read as a normalized number: + 1.0 × 2^-127
• Smaller than the smallest positive normalized number in single-precision IEEE 754 standard.
• Interpreted as positive zero.
• True exponent less than –126 is positive underflow
Negative Zero in IEEE 754
1 00000000 00000000000000000000000
(sign | biased exponent | fraction)
• Read as a normalized number: – 1.0 × 2^-127
• Greater than the largest negative normalized number in single-precision IEEE 754 standard.
• Interpreted as negative zero.
• True exponent less than –126 is negative underflow
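
The two zero patterns can be inspected with Python's standard `struct` and `math` modules. The helper name `bits_to_float` is an assumption for illustration.

```python
import math
import struct


def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit integer as an IEEE 754 single-precision number."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value


pos_zero = bits_to_float(0x00000000)  # 0 00000000 000...0
neg_zero = bits_to_float(0x80000000)  # 1 00000000 000...0
print(pos_zero, neg_zero)            # 0.0 -0.0
print(pos_zero == neg_zero)          # True
print(math.copysign(1.0, neg_zero))  # -1.0
```

The two zeros compare equal, but the sign bit is still preserved, which `math.copysign` makes visible.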
Positive Infinity in IEEE 754
0 11111111 00000000000000000000000
(sign | biased exponent | fraction)
• Read as a normalized number: + 1.0 × 2^128
• Greater than the largest positive number in single-precision IEEE 754 standard.
• Interpreted as + ∞
• A true exponent > 127 cannot be represented; this is positive overflow. A biased exponent of 11111111 with a nonzero fraction is called “not a number” or NaN.
Negative Infinity in IEEE 754
1 11111111 00000000000000000000000
(sign | biased exponent | fraction)
• Read as a normalized number: –1.0 × 2^128
• Smaller than the smallest negative number in single-precision IEEE 754 standard.
• Interpreted as - ∞
• A true exponent > 127 cannot be represented; this is negative overflow. A biased exponent of 11111111 with a nonzero fraction is called “not a number” or NaN.
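
The special patterns can be told apart from the exponent and fraction fields alone. The sketch below is one possible classifier; the function name `classify` and its return strings are assumptions, and it relies on the IEEE 754 rule that an all-ones exponent with a nonzero fraction encodes NaN.

```python
def classify(bits: int) -> str:
    """Classify a 32-bit single-precision pattern by its fields."""
    sign = "-" if (bits >> 31) & 0x1 else "+"
    biased = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased == 0xFF:
        # All-ones exponent: infinity if fraction is zero, otherwise NaN
        return f"{sign}infinity" if fraction == 0 else "NaN"
    if biased == 0:
        # All-zeros exponent: zero if fraction is zero
        return f"{sign}zero" if fraction == 0 else "subnormal (not covered in these slides)"
    return "normalized"


print(classify(0x7F800000))  # +infinity
print(classify(0xFF800000))  # -infinity
print(classify(0x7FC00000))  # NaN
```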
FP Addition and Subtraction
1. Significand alignment: Right shift significand of smaller exponent until two exponents match.

2. Addition: Add significands and report error if overflow occurs.
   If significand = 0, return result as 0.

3. Normalization
   - Shift significand bits to normalize.
   - Report overflow or underflow if exponent goes out of range.

4. Rounding
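
The four steps above can be sketched on a toy format. This is an illustration under stated assumptions, not a full IEEE 754 adder: a number is a pair (signed significand, true exponent) with three fraction bits, bits shifted out are truncated rather than rounded, and the names `fp_add`, `to_float` and `FRAC_BITS` are hypothetical.

```python
FRAC_BITS = 3             # toy precision: significand 1.bbb stored as an integer
E_MIN, E_MAX = -126, 127  # true exponent range of single precision


def fp_add(a, b):
    """Add two toy numbers given as (signed_significand, exponent).

    The value of (s, e) is s * 2**(e - FRAC_BITS); e.g. 1.000_two * 2^1 is (8, 1).
    """
    (sa, ea), (sb, eb) = a, b
    # Step 1: align by shifting the significand with the smaller exponent right.
    if ea < eb:
        (sa, ea), (sb, eb) = (sb, eb), (sa, ea)
    mag_b = abs(sb) >> (ea - eb)  # shifted-out bits are dropped (no guard bits)
    sb = -mag_b if sb < 0 else mag_b
    # Step 2: add significands; a zero significand gives a zero result.
    total, exp = sa + sb, ea
    if total == 0:
        return (0, 0)
    # Step 3: normalize so that 1.000 <= |significand| < 10.000 (binary).
    while abs(total) >= 2 << FRAC_BITS:
        mag = abs(total) >> 1
        total = -mag if total < 0 else mag
        exp += 1
    while abs(total) < 1 << FRAC_BITS:
        total <<= 1
        exp -= 1
    if exp > E_MAX:
        raise OverflowError("exponent overflow")
    if exp < E_MIN:
        raise ArithmeticError("exponent underflow")
    # Step 4: rounding is omitted in this sketch (truncation only).
    return (total, exp)


def to_float(n):
    """Convert a toy (significand, exponent) pair to a Python float."""
    s, e = n
    return s * 2.0 ** (e - FRAC_BITS)


result = fp_add((8, 1), (12, -1))  # 1.000 * 2^1 + 1.100 * 2^-1 = 2.0 + 0.75
print(result, to_float(result))    # (11, 1) 2.75
```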
Example (4 Significant Fraction Bits)
• Subtraction: 0.5_ten – 0.4375_ten
• Floating point numbers to be added: 1.000_two × 2^-1 and –1.110_two × 2^-2
• Significand of lesser exponent is shifted right until exponents match:
–1.110_two × 2^-2 → –0.111_two × 2^-1
• Add significands, 1.000_two + (–0.111_two), using 2’s complement addition with one bit added for sign:
    01000
  + 11001
  = 00001
Result is 0.001_two × 2^-1
• Normalize: 1.000_two × 2^-4
No overflow/underflow since 127 ≥ exponent ≥ –126
• 1.000_two × 2^-4 = (1+0)/16 = 0.0625_ten
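
The significand addition in this example can be checked with a short Python sketch. It assumes a 5-bit two's complement word (one sign bit plus the four significand bits) in which each significand is scaled by 2^3; the names `WIDTH`, `MASK` and `to_twos_complement` are illustrative, not from the lecture.

```python
WIDTH = 5                # 1 sign bit + 4 significand bits
MASK = (1 << WIDTH) - 1


def to_twos_complement(value: int) -> int:
    """Encode a signed integer as a WIDTH-bit two's complement pattern."""
    return value & MASK


a = to_twos_complement(0b1000)   # +1.000_two scaled by 2^3
b = to_twos_complement(-0b0111)  # -0.111_two scaled by 2^3
total = (a + b) & MASK           # discard the carry out of the 5-bit word

print(format(a, "05b"))      # 01000
print(format(b, "05b"))      # 11001
print(format(total, "05b"))  # 00001
```

The result 00001 is 0.001_two, which normalizes to 1.000_two × 2^-4 once the shared exponent of -1 is taken into account.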
FP Multiplication

1. Separate sign
2. Add exponents (integer addition)
3. Multiply significands (integer multiplication)
4. Normalize, round, check overflow/underflow
5. Replace sign
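
The multiplication steps above can be sketched on a toy format with three fraction bits. The sketch assumes true (unbiased) exponents, normalized nonzero operands, and truncation instead of rounding; the name `fp_mul` and the sample operands are hypothetical.

```python
FRAC_BITS = 3             # toy precision: significand 1.bbb stored as an integer
E_MIN, E_MAX = -126, 127  # true exponent range of single precision


def fp_mul(a, b):
    """Multiply toy numbers given as (sign, significand, exponent).

    The value of (s, m, e) is (-1)**s * m * 2**(e - FRAC_BITS), with
    2**FRAC_BITS <= m < 2**(FRAC_BITS + 1) for a normalized operand.
    """
    (sign_a, sig_a, exp_a), (sign_b, sig_b, exp_b) = a, b
    sign = sign_a ^ sign_b   # steps 1 and 5: the sign is handled separately
    exp = exp_a + exp_b      # step 2: add (true) exponents
    product = sig_a * sig_b  # step 3: integer product, scaled by 2**(2 * FRAC_BITS)
    # Step 4: the product of two significands in [1, 2) lies in [1, 4),
    # so at most one right shift is needed to normalize.
    if product >= 2 << (2 * FRAC_BITS):
        product >>= 1
        exp += 1
    sig = product >> FRAC_BITS  # drop extra fraction bits (truncation, no rounding)
    if exp > E_MAX:
        raise OverflowError("exponent overflow")
    if exp < E_MIN:
        raise ArithmeticError("exponent underflow")
    return (sign, sig, exp)


# 1.000 * 2^-1 times -1.110 * 2^-2, i.e. 0.5 * -0.4375 = -0.21875
print(fp_mul((0, 8, -1), (1, 14, -2)))  # (1, 14, -3)
```

The result (1, 14, -3) reads as -1.110_two × 2^-3, which equals -0.21875.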
johnjose@[Link]