COMPUTER ORGANIZATION AND DESIGN RISC-V
Edition
The Hardware/Software Interface
Chapter 3
Arithmetic for Computers
§3.1 Introduction
3.1 Arithmetic for Computers
n Operations on integers
n Addition and subtraction
n Multiplication and division
n Dealing with overflow
n Floating-point real numbers
n Representation and operations
Chapter 3 — Arithmetic for Computers — 2
§3.2 Addition and Subtraction
3.2 Integer Addition
n Example: 7 + 6
n Overflow if result out of range
n Adding +ve and –ve operands, no overflow
n Adding two +ve operands
n Overflow if result sign is 1
n Adding two –ve operands
n Overflow if result sign is 0
Chapter 3 — Arithmetic for Computers — 3
Integer Subtraction
n Add negation of second operand
n Example: 7 – 6 = 7 + (–6)
+7: 0000 0000 … 0000 0111
–6: 1111 1111 … 1111 1010
+1: 0000 0000 … 0000 0001
n Overflow if result out of range
n Subtracting two +ve or two –ve operands, no overflow
n Subtracting +ve from –ve operand
n Overflow if result sign is 0
n Subtracting –ve from +ve operand
n Overflow if result sign is 1
Chapter 3 — Arithmetic for Computers — 4
Arithmetic for Multimedia(A)
n Graphics and media processing operates
on vectors of 8-bit and 16-bit data
n Use 64-bit adder, with partitioned carry chain
n Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
n SIMD (single-instruction, multiple-data)
n Saturating operations
n On overflow, result is largest representable
value
n c.f. 2s-complement modulo arithmetic
n E.g., clipping in audio, saturation in video
Chapter 3 — Arithmetic for Computers — 5
§3.3 Multiplication
3.3 Multiplication(A)
n Start with long-multiplication approach
multiplicand
1000
multiplier
× 1001
1000
0000
0000
1000
product 1001000
Length of product is
the sum of operand
lengths
Chapter 3 — Arithmetic for Computers — 6
Multiplication Hardware
Initially 0
Chapter 3 — Arithmetic for Computers — 7
Optimized Multiplier
n Perform steps in parallel: add/shift
n One cycle per partial-product addition
n That’s ok, if frequency of multiplications is low
Chapter 3 — Arithmetic for Computers — 8
Faster Multiplier
n Uses multiple adders
n Cost/performance tradeoff
n Can be pipelined
n Several multiplication performed in parallel
Chapter 3 — Arithmetic for Computers — 9
RISC-V Multiplication
n Four multiply instructions:
n mul: multiply
n Gives the lower 64 bits of the product
n mulh: multiply high
n Gives the upper 64 bits of the product, assuming the
operands are signed
n mulhu: multiply high unsigned
n Gives the upper 64 bits of the product, assuming the
operands are unsigned
n mulhsu: multiply high signed/unsigned
n Gives the upper 64 bits of the product, assuming one
operand is signed and the other unsigned
n Use mulh result to check for 64-bit overflow
Chapter 3 — Arithmetic for Computers — 10
3.4 RISC-V Division
n Four instructions:
n div, rem: signed divide, remainder
n divu, remu: unsigned divide, remainder
n Overflow and division-by-zero don’t
produce errors
n Just return defined results
n Faster for the common case of no error
Chapter 3 — Arithmetic for Computers — 11
§3.5 Floating Point
3.5 Floating Point
n Representation for non-integral numbers
n Including very small and very large numbers
n Like scientific notation
n –2.34 × 1056 normalized
n +0.002 × 10–4 not normalized
n +987.02 × 109
n In binary
n ±1.xxxxxxx2 × 2yyyy
n Types float and double in C
Chapter 3 — Arithmetic for Computers — 12
Floating Point Standard
n Defined by IEEE Std 754-1985
n Developed in response to divergence of
representations
n Portability issues for scientific code
n Now almost universally adopted
n Two representations
n Single precision (32-bit)
n Double precision (64-bit)
Chapter 3 — Arithmetic for Computers — 13
IEEE Floating-Point Format
single: 8 bits single: 23 bits
double: 11 bits double: 52 bits
S Exponent Fraction
x ( 1)S (1 Fraction) 2(Exponent Bias)
n S: sign bit (0 non-negative, 1 negative)
n Normalize significand: 1.0 ≤ |significand| < 2.0
n Always has a leading pre-binary-point 1 bit, so no need to
represent it explicitly (hidden bit)
n Significand is Fraction with the “1.” restored
n Exponent: excess representation: actual exponent + Bias
n Ensures exponent is unsigned
n Single: Bias = 127; Double: Bias = 1203
Chapter 3 — Arithmetic for Computers — 14
Single-Precision Range
n Exponents 00000000 and 11111111 reserved
n Smallest value
n Exponent: 00000001
actual exponent = 1 – 127 = –126
n Fraction: 000…00 significand = 1.0
n ±1.0 × 2–126 ≈ ±1.2 × 10–38
n Largest value
n exponent: 11111110
actual exponent = 254 – 127 = +127
n Fraction: 111…11 significand ≈ 2.0
n ±2.0 × 2+127 ≈ ±3.4 × 10+38
Chapter 3 — Arithmetic for Computers — 15
Double-Precision Range
n Exponents 0000…00 and 1111…11 reserved
n Smallest value
n Exponent: 00000000001
actual exponent = 1 – 1023 = –1022
n Fraction: 000…00 significand = 1.0
n ±1.0 × 2–1022 ≈ ±2.2 × 10–308
n Largest value
n Exponent: 11111111110
actual exponent = 2046 – 1023 = +1023
n Fraction: 111…11 significand ≈ 2.0
n ±2.0 × 2+1023 ≈ ±1.8 × 10+308
Chapter 3 — Arithmetic for Computers — 16
Floating-Point Precision
n Relative precision
n all fraction bits are significant
n Single: approx 2–23
n Equivalent to 23 × log102 ≈ 23 × 0.3 ≈ 6 decimal
digits of precision
n Double: approx 2–52
n Equivalent to 52 × log102 ≈ 52 × 0.3 ≈ 16 decimal
digits of precision
Chapter 3 — Arithmetic for Computers — 17
Floating-Point Example
n Represent –0.75
n –0.75 = (–1)1 × 1.12 × 2–1
n S=1
n Fraction = 1000…002
n Exponent = –1 + Bias
n Single: –1 + 127 = 126 = 011111102
n Double: –1 + 1023 = 1022 = 011111111102
n Single: 1011111101000…00
n Double: 1011111111101000…00
Chapter 3 — Arithmetic for Computers — 18
Floating-Point Example
n What number is represented by the single-
precision float
11000000101000…00
n S=1
n Fraction = 01000…002
n Fxponent = 100000012 = 129
n x = (–1)1 × (1 + 012) × 2(129 – 127)
= (–1) × 1.25 × 22
= –5.0
Chapter 3 — Arithmetic for Computers — 19
Denormal Numbers
n Exponent = 000...0 hidden bit is 0
x ( 1)S (0 Fraction) 2Bias
n Smaller than normal numbers
n allow for gradual underflow, with
diminishing precision
n Denormal with fraction = 000...0
x ( 1)S (0 0) 2Bias 0.0
Two representations
of 0.0!
Chapter 3 — Arithmetic for Computers — 20
Infinities and NaNs
n Exponent = 111...1, Fraction = 000...0
n ±Infinity
n Can be used in subsequent calculations,
avoiding need for overflow check
n Exponent = 111...1, Fraction ≠ 000...0
n Not-a-Number (NaN)
n Indicates illegal or undefined result
n e.g., 0.0 / 0.0
n Can be used in subsequent calculations
Chapter 3 — Arithmetic for Computers — 21
Floating-Point Addition
n Consider a 4-digit decimal example
n 9.999 × 101 + 1.610 × 10–1
n 1. Align decimal points
n Shift number with smaller exponent
n 9.999 × 101 + 0.016 × 101
n 2. Add significands
n 9.999 × 101 + 0.016 × 101 = 10.015 × 101
n 3. Normalize result & check for over/underflow
n 1.0015 × 102
n 4. Round and renormalize if necessary
n 1.002 × 102
Chapter 3 — Arithmetic for Computers — 22
Floating-Point Addition
n Now consider a 4-digit binary example
n 1.0002 × 2–1 + –1.1102 × 2–2 (0.5 + –0.4375)
n 1. Align binary points
n Shift number with smaller exponent
n 1.0002 × 2–1 + –0.1112 × 2–1
n 2. Add significands
n 1.0002 × 2–1 + –0.1112 × 2–1 = 0.0012 × 2–1
n 3. Normalize result & check for over/underflow
n 1.0002 × 2–4, with no over/underflow
n 4. Round and renormalize if necessary
n 1.0002 × 2–4 (no change) = 0.0625
Chapter 3 — Arithmetic for Computers — 23
FP Adder Hardware
n Much more complex than integer adder
n Doing it in one clock cycle would take too
long
n Much longer than integer operations
n Slower clock would penalize all instructions
n FP adder usually takes several cycles
n Can be pipelined
Chapter 3 — Arithmetic for Computers — 24
FP Adder Hardware
Step 1
Step 2
Step 3
Step 4
Chapter 3 — Arithmetic for Computers — 25
Floating-Point Multiplication
n Consider a 4-digit decimal example
n 1.110 × 1010 × 9.200 × 10–5
n 1. Add exponents
n For biased exponents, subtract bias from sum
n New exponent = 10 + –5 = 5
n 2. Multiply significands
n 1.110 × 9.200 = 10.212 10.212 × 105
n 3. Normalize result & check for over/underflow
n 1.0212 × 106
n 4. Round and renormalize if necessary
n 1.021 × 106
n 5. Determine sign of result from signs of operands
n +1.021 × 106
Chapter 3 — Arithmetic for Computers — 26
Floating-Point Multiplication
n Now consider a 4-digit binary example
n 1.0002 × 2–1 × –1.1102 × 2–2 (0.5 × –0.4375)
n 1. Add exponents
n Unbiased: –1 + –2 = –3
n Biased: (–1 + 127) + (–2 + 127) = –3 + 254 – 127 = –3 + 127
n 2. Multiply significands
n 1.0002 × 1.1102 = 1.1102 1.1102 × 2–3
n 3. Normalize result & check for over/underflow
n 1.1102 × 2–3 (no change) with no over/underflow
n 4. Round and renormalize if necessary
n 1.1102 × 2–3 (no change)
n 5. Determine sign: +ve × –ve –ve
n –1.1102 × 2–3 = –0.21875
Chapter 3 — Arithmetic for Computers — 27
FP Arithmetic Hardware
n FP multiplier is of similar complexity to FP
adder
n But uses a multiplier for significands instead of
an adder
n FP arithmetic hardware usually does
n Addition, subtraction, multiplication, division,
reciprocal, square-root
n FP integer conversion
n Operations usually takes several cycles
n Can be pipelined
Chapter 3 — Arithmetic for Computers — 28
FP Instructions in RISC-V
n Separate FP registers: f0, …, f31
n double-precision
n single-precision values stored in the lower 32 bits
n FP instructions operate only on FP registers
n Programs generally don’t do integer ops on FP data,
or vice versa
n More registers with minimal code-size impact
n FP load and store instructions
n flw, fld
n fsw, fsd
Chapter 3 — Arithmetic for Computers — 29
FP Instructions in RISC-V
n Single-precision arithmetic
n fadd.s, fsub.s, fmul.s, fdiv.s, fsqrt.s
n e.g., fadds.s f2, f4, f6
n Double-precision arithmetic
n fadd.d, fsub.d, fmul.d, fdiv.d, fsqrt.d
n e.g., fadd.d f2, f4, f6
n Single- and double-precision comparison
n feq.s, flt.s, fle.s
n feq.d, flt.d, fle.d
n Result is 0 or 1 in integer destination register
n Use beq, bne to branch on comparison result
n Branch on FP condition code true or false
n [Link]
Chapter 3 — Arithmetic for Computers — 30
Accurate Arithmetic
n IEEE Std 754 specifies additional rounding
control
n Extra bits of precision (guard, round, sticky)
n Choice of rounding modes
n Allows programmer to fine-tune numerical behavior of
a computation
n Not all FP units implement all options
n Most programming languages and FP libraries just
use defaults
n Trade-off between hardware complexity,
performance, and market requirements
Chapter 3 — Arithmetic for Computers — 31
§3.9 Fallacies and Pitfalls
Right Shift and Division
n Left shift by i places multiplies an integer
by 2i
n Right shift divides by 2i?
n Only for unsigned integers
n For signed integers
n Arithmetic right shift: replicate the sign bit
n e.g., –5 / 4
n 111110112 >> 2 = 111111102 = –2
n Rounds toward –∞
n c.f. 111110112 >>> 2 = 001111102 = +62
Chapter 3 — Arithmetic for Computers — 32
§3.6 Parallelism and Computer Arithmetic: Subword Parallelism
3.6 Subword Parallellism
n Graphics and audio applications can take
advantage of performing simultaneous
operations on short vectors
n Example: 128-bit adder:
n Sixteen 8-bit adds
n Eight 16-bit adds
n Four 32-bit adds
n Also called data-level parallelism, vector
parallelism, or Single Instruction, Multiple
Data (SIMD)
Chapter 3 — Arithmetic for Computers — 33
§3.10 Concluding Remarks
Concluding Remarks
n Bits have no inherent meaning
n Interpretation depends on the instructions
applied
n Computer representations of numbers
n Finite range and precision
n Need to account for this in programs
Chapter 3 — Arithmetic for Computers — 34
Concluding Remarks
n ISAs support arithmetic
n Signed and unsigned integers
n Floating-point approximation to reals
n Bounded range and precision
n Operations can overflow and underflow
Chapter 3 — Arithmetic for Computers — 35