0% found this document useful (0 votes)
9 views31 pages

Chapiter 3

The document provides an overview of the MIPS R3000 microprocessor, detailing its architecture, components, and operation. It explains the role of the CPU, including registers, the Arithmetic and Logic Unit (ALU), and instruction decoding, as well as the instruction set architecture (ISA) and various instruction formats. Additionally, it covers concepts such as RISC vs. CISC, addressing modes, and pipelining in instruction execution.

Uploaded by

ytgf7024
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views31 pages

Chapiter 3

The document provides an overview of the MIPS R3000 microprocessor, detailing its architecture, components, and operation. It explains the role of the CPU, including registers, the Arithmetic and Logic Unit (ALU), and instruction decoding, as well as the instruction set architecture (ISA) and various instruction formats. Additionally, it covers concepts such as RISC vs. CISC, addressing modes, and pipelining in instruction execution.

Uploaded by

ytgf7024
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

COMPUTER ARCHITECTURE

2nd Year Computer science


Chapiter 3:

PROCESSOR
(The MIPS R3000 microprocessor)

Abdelhafid Boussouf University Center 1


2024-2025

MICROPROCESSORS
A program stored in the memory provides instructions to the CPU to perform a specific
action. This action can be a simple addition. It is function of the CPU to fetch the program
instructions from the memory and execute them.

 The CPU contains a number of registers to store information inside the CPU temporarily.
Registers inside the CPU can be 8-bit, 16-bit, 32-bit or even 64-bit depending on the
CPU.
 The CPU also contains Arithmetic and Logic Unit (ALU). The ALU performs arithmetic
(add, subtract, multiply, divide) and logic (AND, OR, NOT) functions.
 The CPU contains a program counter also known as the Instruction Pointer to
point the address of the next instruction to be executed.
 Instruction Decoder is a kind of dictionary which is used to interpret the meaning of the
instruction fetched into the CPU. Appropriate control signals are generated according to the
meaning of the instruction. 2

1
MICROPROCESSORS
Inside the CPU:
Address Bus

Instruction Pointer

Instruction Register
Instruction Decoder Timing Control Bus
and control signals are
Flags
ALU generated

Data Bus

Internal Register A
Busses Register B
Register C
Register D

Internal block diagram of a CPU

CPU Clocking
Clock period
Clock (cycles)
Data transfer
and computation
Update state

Operation of digital hardware governed by a constant-


rate clock
Clock period: duration of a clock cycle
e.g., 250 ps = 0.25 ns = 250×10–12 s
Clock frequency (rate): cycles per second
e.g., 4.0 GHz = 4000 MHz = 4.0×109 Hz 4

2
CPU Time
Instructio ns Clock cycles Seconds
CPU Time   
Program Instructio n Clock cycle

Performance depends on
 Algorithm: affects IC, possibly CPI
 Programming language: affects IC, CPI
 Compiler: affects IC, CPI
 Instruction set architecture: affects IC, CPI, Tc

Instruction Set Architecture (ISA)


 Critical Interface between software and hardware
 An ISA includes the following …
 Instructions and Instruction Formats
 Data Types, Encodings, and Representations
 Programmable Storage: Registers and Memory
 Addressing Modes: to address Instructions and Data
 Handling Exceptional Conditions (like overflow)
 Examples (Versions) Introduced in
 Intel (8086, 80386, Pentium, Core, ...) 1978
 MIPS (MIPS I, II, …, MIPS32, MIPS64) 1986
6
 ARM (version 1, 2, …) 1985

3
Instruction Set
The words of a computer’s language are called instructions and
the vocabulary of commands understood by a given architecture is
called an instruction set. Common groups of instructions are:

• Arithmetic instructions
• Logical instructions
• Data transfer instructions
• Conditional branch instructions
• Unconditional jump instructions
7

RISC and CISC


🞍 Reduced instruction set computer (RISC)
 means: small number of simple instructions
 example: MIPS

🞍 Complex instruction set computers (CISC)


 means: large number of instructions
 example: Intel’s x86

4
RISC Principles

All instructions are executed by hardware


Maximize the rate at which instructions are issued
Instructions should be easy to decode
Only loads and stores should reference memory
Provide plenty of registers

RISC vs. CISC

10

5
MIPS Microprocessor
MIPS (Microprocessor without Interlocked Pipelined Stages) is a
family of reduced instruction set computer (RISC) instruction set
architectures (ISA) ; developed by MIPS Computer Systems, now MIPS
Technologies, based in the United States.

11

Overview of the MIPS Architecture

12

6
Overview of the MIPS Architecture
...
4 bytes per word Memory
Up to 232 bytes = 230 words
...

EIU $0 Execution & FPU F0 Floating


32 General $1 Integer Unit F1 Point Unit
Purpose $2 (Main proc) F2 (Coproc 1) 32 Floating-Point
Registers Registers
$31 F31
Arithmetic & Integer FP
ALU mul/div
Logic Unit Arith Floating-Point
Hi Lo Arithmetic Unit
TMU BadVaddr Trap &
Status Memory Unit
Cause
Integer (Coproc 0)
EPC
Multiplier/Divider

13

MIPS General-Purpose Registers


 32 General Purpose Registers (GPRs)
 All registers are 32-bit wide in the MIPS 32-bit architecture
 Software defines names for registers to standardize their use
 Assembler can refer to registers by name or by number ($ notation)
Name Register Usage
$zero $0 Always 0 (forced by hardware)
$at $1 Reserved for assembler use
$v0 – $v1 $2 – $3 Result values of a function
$a0 – $a3 $4 – $7 Arguments of a function
$t0 – $t7 $8 – $15 Temporary Values
$s0 – $s7 $16 – $23 Saved registers (preserved across call)
$t8 – $t9 $24 – $25 More temporaries
$k0 – $k1 $26 – $27 Reserved for OS kernel
$gp $28 Global pointer (points to global data)
$sp $29 Stack pointer (points to top of stack)
$fp $30 Frame pointer (points to stack frame) 14
$ra $31 Return address (used by jal for function call)

7
Special-Purpose Registers
-PC (Program Counter), points to the next instruction to be executed
-Hi :High result of multiplication and division operations
-Lo :Low result of multiplication and division operations
-SR (status): Status Register, Contains the interrupt mask and enable bits
-CAUSE : specifies what kind of interrupt or exception just happened.
-EPC : Exception PC, Contains the address of the instruction when the
exception occurred.
-Vaddr: Bad Address Register, Contains the invalid memory address caused by
load, store, or fetch.
15

Instruction Formats
 All instructions are 32-bit wide, Three instruction formats:
 Register (R-Type)
 Register-to-register instructions
 Op: operation code specifies the format of the instruction
Op6 Rs5 Rt5 Rd5 sa5 funct6

 Immediate (I-Type)
 16-bit immediate constant is part in the instruction
Op6 Rs5 Rt5 immediate16

 Jump (J-Type)
 Used by jump instructions
16
Op6 immediate26

8
R-Type Instruction Format
Op6 Rs5 Rt5 Rd5 shamt5 funct6

 Op: operation code (opcode)

 Specifies the operation of the instruction


 Also specifies the format of the instruction

 funct: function code – extends the opcode

 Up to 26 = 64 functions can be defined for the same opcode


 MIPS uses opcode 0 to define many R-type instructions

 Three Register Operands (common to many instructions)

 Rs, Rt: first and second source operands


 Rd: destination operand
 shamt: the shift amount used by shift instructions
17

Encoding MIPS Instructions


Field Opcode

18

9
Encoding MIPS Instructions
Field func when OPCOD = SPECIAL

19

Encoding MIPS Instructions


Examples:

20

10
Encoding MIPS Instructions
Examples:

21

From Assembly to Machine Code


Let’s see an example of a R-format instruction, first as a combination of
decimal numbers and then of binary numbers. Consider the instruction:
add $t0, $s1, $s2

The op and funct fields in combination (0 and 32 in this case) tell that this
instruction performs addition (add).

The rs and rt fields, registers $s1 (17) and $s2 (18), are the source operands,
and the rd field, register $t0 (8), is the destination operand.

The shamt field is unused in this instruction, so it is set to 0. 22

11
From Assembly to Machine Code
Thus, the decimal representation of instruction add $t0, $s1, $s2 is:
• op = 000000 (special)
• rs = 17 ($s1)
• rt = 18 ($s2)
• rd = 8 ($t0)
• shamt = 0 (not used)
• funct = 100000(add)

The binary representation is:

23

Pseudo-Instructions
Most assembler instructions represent machine instructions one-to-one. The
assembler can also treat common variations of machine instructions as if they
were instructions in their own right. Such instructions are called pseudo-
instructions.
The hardware need not implement the pseudo-instructions, but their appearance
in assembly language simplifies programming. Register $at (assembler
temporary) is reserved for this purpose.

blt $s1, $s2, L 🡪 slt $at, $s1, $s2


bne $at, $zero, L
li $s1, 20 🡪 addiu $s1, $zero, 20
move $t0, $t1 🡪 addu $t0, $zero, $t1
24

12
Addressing Modes

MIPS addressing modes are:


1. Immediate addressing where the operand is a constant in the instruction
itself
2. Register addressing where the operand is a register
3. Base or displacement addressing where the operand is at the memory
location whose address is the sum of a register and a constant in the
instruction
4. PC-relative addressing where the branch address is the sum of the PC
with a constant in the instruction
5. Pseudo-direct addressing where the jump address is a constant in
the instruction concatenated with the upper bits of the PC
25

Addressing Modes

26

13
Addressing Modes

27

Addressing Modes
Immediate

op rs rt Immediate
addi $t0, $t1, 5

28

14
Addressing Modes
Register

op rs rt rd funct

Register
add $t0, $t1, $t2

29

Addressing Modes

Base (Arrays, structures, pointers)

op rs rt Address

± Register

lw $t1, 4($s2) Memory


lw $t1, ($s2) #indirect addressing
30

15
Addressing Modes
PC-relative (e.g., conditional branches, need an offset)

op rs rt Address (offset)

beqz $t0, goEnd ± PC

Memory
 adding a 16-bit address shifted left 2 bits to thePC
31

Addressing Modes

Pseudodirect

op Address

concat PC
j for
Memory
Concatenating a 26-bit address shifted left 2 bits with the 4 upper bits of the PC.
32

16
Addressing Modes
Examples:

33

Byte-‐Addressable Memory
🞍 Each data byte has a unique address

🞍 Load/store words or single bytes: load byte (lb) and store byte
(sb)

🞍 Each 32-‐bit words has 4 bytes, so the word address increments


by 4. MIPS uses byte addressable memory
Word Address Data

0000000C 4 0 F 3 0 7 8 8 Word 3
00000008 0 1 EE 2 8 4 2 Word 2
00000004 F 2 F 1 A C 0 7 Word 1
00000000 AB CD E F 7 8 Word 0
34
width = 4 bytes

17
Address Space
The MIPS address space is divided in four segments:
• Text, which contains the program code
• Data, which contains constants and global variables
• Heap, which contains memory dynamically allocated during runtime
• Stack, which contains temporary data for handling procedure calls

The heap and stack segments grow toward each other, thereby allowing the
efficient use of memory as the two segments expand and shrink.

35

Address Space

36

18
Example Program: Executable
Executable file header Text Size Data Size
0x34 (52 bytes) 0xC (12 bytes)

Text segment Address Instruction


0x00400000 0x23BDFFFC
0x00400004 0xAFBF0000
0x00400008 0x20040002
0x0040000C 0xAF848000
0x00400010 0x20050003
0x00400014 0xAF858004
0x00400018 0x0C10000B
0x0040001C 0xAF828008
0x00400020 0x8FBF0000
0x00400024 0x23BD0004
0x00400028 0x03E00008
0x0040002C 0x00851020
0x00400030 0x03E0008

Data segment Address Data


0x10000000 fgy
0x10000004
0x10000008
37

Example Program: In Memory


Address Memory

Reserved
0x7FFFFFFC
Stack $sp = 0x7FFFFFFC

0x10010000 Heap

$gp = 0x10008000

y
g
0x10000000 f

0x03E00008
0x00851020
0x03E00008
0x23BD0004
0x8FBF0000
0xAF828008
0x0C10000B
0xAF858004
0x20050003
0xAF848000
0x20040002
0xAFBF0000
0x00400000 0x23BDFFFC PC = 0x00400000
Reserved
38

19
Pipelining

39

Pipelining: Basic Idea


 More systematically:
 Pipeline the execution of multiple instructions
 Analogy: “Assembly line processing” of instructions

 Idea:
 Divide the instruction processing cycle into distinct “stages” of processing
 Ensure there are enough hardware resources to process one instruction in each stage
 Process a different instruction in each stage
 Instructions consecutive in program order are processed in consecutive stages

 Benefit: Increases instruction processing throughput (1/CPI)

40

20
Example: Execution of Four Independent ADDs
Multi-cycle: 4 cycles per instruction
F D E W
F D E W
F D E W
F D E W
Time
Pipelined: 4 cycles per 4 instructions
1 instruction completed per cycle
F D E W
F D E W
F D E W
F D E W

Time
41

The Laundry Analogy


6 PM 7 8 9 10 11 12 1 2 AM
Time
Task
order
A

 “place one dirtyTime


load of clothes
6 PM 7
in the
8
washer”
9 10 11 12 1 2 AM

 “when the washer


Task
is finished, place the wet load in the dryer”
order
 “when the dryer isAfinished, take out the dry load and fold”
 “when folding is finished,
B ask your roommate (??) to put the clothes away”
C

D 42
Based on original figure from [P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]

21
Pipelining Multiple Loads of Laundry
Time
6 PM 7 8 9 10 11 12 1 2 AM

Task
order
6 PM 7 8 9 10 11 12 1 2 AM
TimeA
Task
B
order
A
C

D
B

6 PM 7 8 9 10 11 12 1 2 AM
Time

Task
order 6 PM 7 8 9 10 11 12 1 2 AM
Time
A
- 4 loads of laundry in parallel
Task
order B
- no additional resources
A
C
- throughput increased by 4
B
D
- latency per load is the same
C
43
D
Based on original figure from [P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]

Remember: The Instruction Processing Cycle

44

22
Pipelining and ISA Design
MIPS ISA designed for pipelining
 All instructions are 32-bits
 Easier to fetch in one cycle

 Few and regular instruction formats


 Can decode and read registers in one step

 Load/store addressing
 Can calculate address in 3rd stage, access memory in 4th stage

 Alignment of memory operands


45
 Memory access takes only one cycle

Remember: The Instruction Processing Cycle

46

23
Pipelining Abstraction
1 2 3 4 5 6 7 8 9 10

Time (cycles)
$0
lw $s2
lw $s2, 40($0) IM RF 40 + DM RF

$t1
add $s3
add $s3, $t1, $t2 IM RF $t2 + DM RF

$s1
sub $s4
sub $s4, $s1, $s5 IM RF $s5 - DM RF

$t5
and $s5
and $s5, $t5, $t6 IM RF $t6 & DM RF

$s1
sw $s6
sw $s6, 20($s1) IM RF 20 + DM RF

$t3
or $s7
or $s7, $t3, $t4 IM RF $t4 | DM RF

47

Illustrating Pipeline Operation: Operation View

t0 t1 t2 t3 t4 t5
Inst0 IF ID EX MEM WB
Inst1 IF ID EX MEM WB
Inst2 IF ID EX MEM WB
Inst3 IF ID EX MEM WB
Inst4 IF ID EX MEM
IF ID EX
IF ID
steady state
(full pipeline) IF

48

24
Illustrating Pipeline Operation: Resource View

t0 t1 t2 t3 t4 t5 t6 t7 t8 t9 t10

IF I0 I1 I2 I3 I4 I5 I6 I7 I8 I9 I10

ID I0 I1 I2 I3 I4 I5 I6 I7 I8 I9

EX I0 I1 I2 I3 I4 I5 I6 I7 I8

MEM I0 I1 I2 I3 I4 I5 I6 I7

WB I0 I1 I2 I3 I4 I5 I6

49

Instruction Pipeline: Not An Ideal Pipeline


Identical operations ... NOT!
 different instructions  not all need the same stages
Forcing different instructions to go through the same pipe stages
 external fragmentation (some pipe stages idle for some instructions)

Uniform suboperations ... NOT!


 different pipeline stages  not the same latency
Need to force each stage to be controlled by the same clock
 internal fragmentation (some pipe stages are too fast but all take the same clock cycle time)

Independent operations ... NOT!


 instructions are not independent of each other
Need to detect and resolve inter-instruction dependences to ensure the pipeline provides
correct results
 pipeline stalls (pipeline is not always moving)
50

25
Pipelining Hazards

51

Structural hazard
Caused by limitations in hardware that don’t allow
concurrent execution of different instructions
Examples
 Bus
 Single ALU
 Single Memory for instructions and data
 Single IR
Remedy is to add additional elements to datapath to
eliminate hazard
52

26
Data Hazards Types
 An instruction depends on completion of data access by a
previous instruction
r3  r1 op r2 Read-after-Write
r5  r3 op r4 (RAW)

r3  r1 op r2 Write-after-Read
r1  r4 op r5 (WAR)

r3  r1 op r2 Write-after-Write
r5  r3 op r4 (WAW)
53
r3  r6 op r7

Control Hazard
Special case of data dependence: dependence on PC
beq:
 branch is not determined until the fourth stage of the pipeline
 Instructions after the branch are fetched before branch is
resolved
 Always predict that the next sequential instruction is fetched
 Called “Always not taken” prediction
 These instructions must be flushed if the branch is taken
Branch misprediction penalty
 number of instructions flushed when branch is taken
 May be reduced by determining branch earlier
54

27
Control Hazard
1 2 3 4 5 6 7 8 9

Time (cycles)
$t1
lw DM
20 beq $t1, $t2, 40 IM RF $t2 - RF

$s0
and DM
24 and $t0, $s0, $s1 IM RF $s1 & RF
Flush
$s4 these
or DM instructions
28 or $t1, $s4, $s0 IM RF $s0 | RF

$s0
sub DM
2C sub $t2, $s0, $s5 IM RF $s5 - RF

30 ...
...
$s2
slt DM $t3

slt
64 slt $t3, $s2, $s3 IM RF $s3 RF

55

Causes of Pipeline Stalls


Stall: A condition when the pipeline stops moving

Resource contention

Dependences (between instructions)


 Data hazard
 Control hazard

Long-latency (multi-cycle) operations

56

28
How Can You Handle Data Hazards?
■ Insert “NOP”s (No OPeration) in code at compile time

■ Rearrange code at compile time

■ Forward data at run time

■ Stall the processor at run time

57

Data Forwarding/Bypassing
 Problem: A consumer (dependent) instruction has to wait in decode
stage until the producer instruction writes its value in the register file
 Goal: We do not want to stall the pipeline unnecessarily
 Observation: The data value needed by the consumer instruction can
be supplied directly from a later stage in the pipeline (instead of only
from the register file)
 Idea: Add additional dependence check logic and data forwarding
paths (buses) to supply the producer’s value to the consumer right
after the value is available
 Benefit: Consumer can move in the pipeline until the point the value
can be supplied  less stalling
58

29
Data Forwarding/Bypassing
 Use result when it is computed
 Don’t wait for it to be stored in a register
 Requires extra connections in the datapath

No bubble!

4-3=1, no stall!

59

Data Forwarding/Bypassing
 Can’t always avoid stalls by forwarding
 If value not computed when needed
 Can’t forward backward in time!

60

30
Code Scheduling to Avoid Stalls
 Reorder code to avoid use of load result in the next
instruction
 C code for A = B + E; C = B + F;

lw $t1, 0($t0) lw $t1, 0($t0)


lw $t2, 4($t0) lw $t2, 4($t0)
add $t3, $t1, $t2 lw $t4, 8($t0)
stall
sw $t3, 12($t0) add $t3, $t1, $t2
lw $t4, 8($t0) sw $t3, 12($t0)
stall
add $t5, $t1, $t4 add $t5, $t1, $t4
sw $t5, 16($t0) sw $t5, 16($t0)
13 cycles 11 cycles
61

Stall on Branch
 In MIPS pipeline
 Need to compare registers and compute target earlier in the pipeline
 Add extra hardware to do it in ID stage (earliest ? )

 Wait until branch outcome determined before fetching next


instruction
 1 bubble when determine in ID
 Is no stall possible? IF, prediction

62

31

You might also like