
Computer Architecture Overview

The document outlines the fundamentals of computer architecture and organization, focusing on components, functions, and the instruction cycle. It details the von Neumann architecture, the roles of the CPU, memory, and I/O components, as well as the processes involved in instruction fetching, execution, and handling interrupts. Additionally, it discusses the importance of interrupts in managing I/O operations and ensuring efficient program execution.

Uploaded by

refsen00

CSE309 - Computer Architecture and Organization

Computer Systems: A Top-Level View
Outline
1. Computer Components
2. Computer Function
   Instruction Fetch and Execute
   Interrupts
   I/O Function
3. Interconnection Structures
4. Bus Interconnection
5. Point-to-Point Interconnect
   QPI Physical Layer
   QPI Link Layer
   QPI Routing Layer
   QPI Protocol Layer
6. PCI Express
   PCI Physical and Logical Architecture
   PCIe Physical Layer
   PCIe Transaction Layer
   PCIe Data Link Layer
Computer Components

Virtually all contemporary computer designs are based on concepts developed by John von Neumann.
Such a design is referred to as the von Neumann architecture.
It is based on three key concepts:
■■ Data and instructions are stored in a single read–write memory.
■■ The contents of this memory are addressable by location, without regard to the type of data contained there.
■■ Execution occurs in a sequential fashion (unless explicitly modified) from one instruction to the next.
Computer Components
Hardwired program → the process of connecting the various components in the desired configuration is a form of programming. For each new program, the hardware is rewired to obtain the desired configuration.
At each step, some arithmetic or logical operation is performed on some data.
(a) Programming in hardware: Data → Sequence of arithmetic and logic functions → Results

Software → a sequence of codes or instructions.
Each code is, in effect, an instruction.
The instruction interpreter part of the hardware interprets each instruction and generates control signals.
At each step, some arithmetic or logical operation is performed on some data. For each step, a new set of control signals is needed.
(b) Programming in software: Instruction codes → Instruction interpreter → Control signals → General-purpose arithmetic and logic functions (Data → Results)

Hardware and Software Approaches (Fig 3.1)
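The software approach above can be sketched in a few lines of Python. This is only an illustration: the instruction codes (0 to 3), the function names, and the choice of four operations are assumptions made for the example, not part of the slides.

```python
# General-purpose arithmetic and logic functions (the fixed "hardware").
ALU_FUNCTIONS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
}

# Hypothetical instruction codes; this numbering is an assumption.
INSTRUCTION_CODES = {0: "ADD", 1: "SUB", 2: "AND", 3: "OR"}


def interpret(code):
    """Instruction interpreter: map an instruction code to a control signal."""
    return INSTRUCTION_CODES[code]


def run(program, data):
    """Apply each instruction code in turn to a running result.

    program: list of instruction codes, one per step
    data:    first value, followed by one operand per step
    """
    result = data[0]
    for code, operand in zip(program, data[1:]):
        control_signal = interpret(code)  # a new control signal for each step
        result = ALU_FUNCTIONS[control_signal](result, operand)
    return result


# (5 + 3) - 2, expressed as a sequence of codes rather than rewired hardware.
print(run([0, 1], [5, 3, 2]))
```

Changing the behaviour means supplying a different list of codes; the functions in `ALU_FUNCTIONS` stay untouched, which is the point of the software approach.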
Computer Components

A CPU alone is not enough for a functional computer.

I/O components are also required.

Memory (main memory) is also required.


Computer Components
Memory address register (MAR): specifies the address in memory for the next read or write.
Memory buffer register (MBR): holds the data exchanged between main memory and the CPU.
I/O address register (I/OAR): specifies a particular I/O device.
I/O buffer register (I/OBR): holds the data exchanged between an I/O module and the CPU.

Computer Components: Top-Level View (Fig. 3.2)
CPU: PC, IR, MAR, MBR, I/O AR, I/O BR, execution unit
Main memory: locations 0 to n–1, holding instructions and data
I/O module: buffers
The CPU, main memory, and I/O module are connected by the system bus.

PC = Program counter
IR = Instruction register
MAR = Memory address register
MBR = Memory buffer register
I/O AR = Input/output address register
I/O BR = Input/output buffer register
Computer Function - Instruction Cycle
Basic Instruction Cycle (Fig 3.3): START → Fetch next instruction (fetch cycle) → Execute instruction (execute cycle) → HALT. After the execute cycle, control returns to the fetch cycle for the next instruction.

Instruction processing consists of two steps → (1) fetch and (2) execute.
The processing (fetch and execute) required for a single instruction is called the ‘instruction cycle’.
Program execution halts only if
• the machine is turned off
• some sort of unrecoverable error occurs
• a program instruction that halts the computer is encountered.
Computer Function - Instruction Cycle
1. At the beginning of each instruction cycle, the processor fetches an instruction from memory.
2. The program counter (PC) holds the address of the instruction to be fetched next.
3. The processor increments the PC after each instruction fetch so that it will fetch the next instruction in sequence.
4. The fetched instruction is loaded into the instruction register (IR).
5. The processor interprets the instruction and performs the required action.
(Registers as in Computer Components: Top-Level View, Fig. 3.2.)
Computer Function - Instruction Cycle

The required action falls into one of four categories:
■■ Processor-memory: Data may be transferred
from processor to memory or from memory to
processor.
■■ Processor-I/O: Data may be transferred to or from
a peripheral device by transferring between the
processor and an I/O module.
■■ Data processing: The processor may perform
some arithmetic or logic operation on data.
■■ Control: An instruction may specify that the
sequence of execution be altered.
An instruction’s execution may involve one of these actions or a combination of them.
Figure 3.4 Characteristics of a Hypothetical Machine
(a) Instruction format (16 bits long): bits 0–3 = Opcode, bits 4–15 = Address
(b) Integer format: bit 0 = S (sign), bits 1–15 = Magnitude
(c) Internal CPU registers:
    Program Counter (PC) = Address of instruction
    Instruction Register (IR) = Instruction being executed
    Accumulator (AC) = Temporary storage
(d) Partial list of opcodes (binary, with the matching hexadecimal digit):
    0001 = Load AC from Memory (1)
    0010 = Store AC to Memory (2)
    0101 = Add to AC from Memory (5)
Note: The processor contains a single data register, called an accumulator (AC).

Figure 3.5 Example of Program Execution (contents of memory and registers in hexadecimal)
The PC contains 300, the address of the first instruction.

Memory at the start:
300: 1940
301: 5941
302: 2941
940: 0003
941: 0002

Step   PC    AC     IR     Memory 941
1      300   -      1940   0002
2      301   0003   1940   0002
3      301   0003   5941   0002
4      302   0005   5941   0002   (3 + 2 = 5)
5      302   0005   2941   0002
6      303   0005   2941   0005
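The program execution of Figure 3.5 can be replayed with a small simulator. The sketch below assumes only the three opcodes listed in Figure 3.4(d); the function name `run`, the use of a Python dictionary for memory, and the treatment of values as plain unsigned 16-bit numbers (ignoring the sign bit of the integer format) are simplifying assumptions.

```python
LOAD, STORE, ADD = 0x1, 0x2, 0x5  # opcodes from Figure 3.4(d)


def run(memory, pc, steps):
    """Run `steps` instruction cycles starting at address `pc`.

    Returns the final (pc, ac, ir); `memory` is modified in place.
    """
    ac = 0
    ir = 0
    for _ in range(steps):
        # Fetch cycle: load the instruction into IR, then increment PC.
        ir = memory[pc]
        pc += 1
        # Execute cycle: bits 0-3 are the opcode, bits 4-15 the address.
        opcode = ir >> 12
        address = ir & 0xFFF
        if opcode == LOAD:
            ac = memory[address]
        elif opcode == STORE:
            memory[address] = ac
        elif opcode == ADD:
            ac = (ac + memory[address]) & 0xFFFF  # keep AC within 16 bits
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")
    return pc, ac, ir


# Memory contents from Figure 3.5 (addresses and values in hexadecimal).
memory = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
          0x940: 0x0003, 0x941: 0x0002}
pc, ac, ir = run(memory, pc=0x300, steps=3)
print(f"PC={pc:03X} AC={ac:04X} IR={ir:04X} mem[941]={memory[0x941]:04X}")
# prints: PC=303 AC=0005 IR=2941 mem[941]=0005
```

Each pass through the loop covers two steps of the figure: the fetch half corresponds to steps 1, 3, and 5, and the execute half to steps 2, 4, and 6.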
Computer Function - Instruction Cycle
Figure 3.6 Instruction Cycle State Diagram
■■ Instruction address calculation (iac): Determine the address of the next instruction to be executed.
■■ Instruction fetch (if): Read instruction from its memory location into the processor.
■■ Instruction operation decoding (iod): Analyze instruction to determine type of operation to be performed and operand(s) to be used.
■■ Operand address calculation (oac): If the operation involves reference to an operand in memory or available via I/O, then determine the address of the operand.
■■ Operand fetch (of): Fetch the operand from memory or read it in from I/O.
■■ Data operation (do): Perform the operation indicated in the instruction.
■■ Operand store (os): Write the result into memory or out to I/O.

States in the upper part of Figure 3.6 (instruction fetch, operand fetch, operand store) involve an exchange between the processor and either memory or an I/O module.
States in the lower part of the diagram (instruction address calculation, instruction operation decoding, operand address calculation, data operation) involve only internal processor operations.
An instruction may involve multiple operands and multiple results. After the operand store, the diagram either returns for string or vector data or, when the instruction is complete, fetches the next instruction.
Computer Function - Interrupts
The processing of the processor can be interrupted by several classes of events (Table 3.1):
Program → Generated by some condition that occurs as a result of an instruction execution, such as arithmetic overflow, division by zero, attempt to execute an illegal machine instruction, or reference outside a user's allowed memory space.
Timer → Generated by a timer within the processor. This allows the operating system to perform certain functions on a regular basis.
I/O → Generated by an I/O controller, to signal normal completion of an operation, request service from the processor, or to signal a variety of error conditions.
Hardware failure → Generated by a failure such as power failure or memory parity error.

Since I/O operations are slow, they are designed to work together with the interrupt system.
For example:
• Data is sent to the printer.
• The processor goes back to doing other tasks.
• When the printer finishes its job, it sends an interrupt: "printing is done."
• Then the processor says, "okay, now I can send more data."
Computer Function - Interrupts
Workflow with interrupts (Fig. 3.8, Transfer of Control via Interrupts): the user program executes instructions 1, 2, ..., i, i+1, and so on. The interrupt occurs after instruction i; the processor's current activity is saved at that point, the interrupt handler runs, and the user program then continues at instruction i+1.

When an interrupt happens, the processor suspends what it is doing for a moment. It takes care of the interrupt (for example, checking whether I/O is done or whether data has arrived), then goes back and continues the program from where it stopped. This lets the processor do useful work instead of waiting for slow I/O.

Instruction Cycle with Interrupt (Fig 3.9): In the interrupt cycle, the processor checks to see if any interrupts have occurred.
Computer Function - Interrupts
Interrupt handler: sequence of events

1. The user program is running (normal process)
   Example: Segment 2 is executing
2. Hardware sends an interrupt
   Example: Printer says “printing completed”
3. CPU temporarily stops its current task
   Program Counter + registers are saved
4. Interrupt Handler starts
   → The cause of the interrupt is identified
   → Additional action is taken if needed
5. Handler finishes, CPU returns to program
   Execution continues from Segment 2
6. Program resumes normal flow
   → Proceeds to Segment 3
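The six steps above can be sketched as an instruction cycle with an interrupt check. This is a minimal model, not a description of real hardware: the function name `run_with_interrupts`, the idea of treating each program segment as one "instruction", and the `interrupts` dictionary are all assumptions made for illustration.

```python
def run_with_interrupts(program, interrupts, handler):
    """Execute `program`, checking for a pending interrupt after each step.

    program:    list of segment names, executed in order
    interrupts: maps a step index to the device that raises an interrupt
                while that step is executing
    handler:    function called with the device name (the interrupt handler)
    Returns a trace of everything that ran, in order.
    """
    trace = []
    pc = 0
    while pc < len(program):
        trace.append(program[pc])          # fetch and execute
        pc += 1                            # PC now points at the next step
        pending = interrupts.get(pc - 1)   # interrupt cycle: anything pending?
        if pending is not None:
            saved_pc = pc                  # save the program counter
            trace.append(handler(pending)) # run the interrupt handler
            pc = saved_pc                  # restore and resume the program
    return trace


program = ["segment 1", "segment 2", "segment 3"]
trace = run_with_interrupts(
    program,
    interrupts={1: "printer"},             # printer interrupts during segment 2
    handler=lambda device: f"handler({device})",
)
print(trace)
```

In this run the handler appears in the trace between segment 2 and segment 3, matching steps 3 to 6 of the list.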
Figure 3.10 Program Timing: Short I/O Wait
(a) Without interrupts: the processor waits during each I/O operation.
(b) With interrupts: the I/O operation is concurrent with processor execution.
Program code segments are shaded green, and I/O program code segments are shaded grey.
Figure 3.11 Program Timing: Long I/O Wait
Figure 3.12 Instruction Cycle State Diagram, with Interrupts: the state diagram of Figure 3.6 with an interrupt check added once the instruction is complete. If there is no interrupt, the next instruction is fetched; otherwise the interrupt is processed first.
Computer Function – Interrupts – Multiple interrupts
(a) Sequential interrupt processing: User program → Interrupt handler X, then Interrupt handler Y
Interrupts are disabled while an interrupt is being processed. A disabled interrupt simply means that the processor can and will ignore that interrupt request signal.
Interrupts are handled in strict sequential order.
This method is simple.

(b) Nested interrupt processing: User program → Interrupt handler X → Interrupt handler Y
Interrupts are prioritized, so they are handled according to their relative priority.
This method is efficient.

Figure 3.13 Transfer of Control with Multiple Interrupts
Computer Function – Interrupts – Multiple interrupts
Interrupt service routine (ISR) → a software routine that hardware invokes in response to an interrupt.
Priority: Communication > Disk > Printer

Figure 3.14 Example Time Sequence of Multiple Interrupts
Routines shown: user program, printer interrupt service routine, communication interrupt service routine, disk interrupt service routine.
Time labels in the figure: t = 0, t = 10, t = 15, t = 25, t = 35, t = 40.

1. User program starts
2. User program is interrupted
3. Printer ISR starts
4. Printer ISR is interrupted
5. Communication ISR completed
6. Disk ISR completed
7. Printer ISR completed
8. User program completed
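The time sequence of Figure 3.14 can be approximated with a tick-by-tick simulation in which the highest-priority pending ISR always runs. The slides give only the priority order and the figure's time labels, so the numeric priorities (2, 4, 5), the arrival times (10, 15, 20), and the 10-unit ISR durations below are assumptions chosen to be consistent with those labels. Picking the highest-priority ready routine each tick is a simplification of nested interrupt handling.

```python
def simulate(arrivals, end_time):
    """Return a list of (start, end, name) segments showing who holds the CPU.

    arrivals: list of (time, name, priority, duration) interrupt requests;
              a larger priority number means more urgent.
    """
    remaining = {}  # ISR name -> time units of work still to do
    priority = {}   # ISR name -> priority
    timeline = []
    for t in range(end_time):
        # Register interrupts that arrive at this tick.
        for time, name, prio, duration in arrivals:
            if time == t:
                remaining[name] = duration
                priority[name] = prio
        # Run the most urgent unfinished ISR, or the user program if none.
        ready = [name for name, left in remaining.items() if left > 0]
        if ready:
            running = max(ready, key=lambda name: priority[name])
            remaining[running] -= 1
        else:
            running = "user program"
        # Extend the current segment or start a new one.
        if timeline and timeline[-1][2] == running:
            start = timeline[-1][0]
            timeline[-1] = (start, t + 1, running)
        else:
            timeline.append((t, t + 1, running))
    return timeline


# Assumed arrival times, priorities, and durations (not given in the slides).
arrivals = [
    (10, "printer ISR", 2, 10),
    (15, "communication ISR", 5, 10),
    (20, "disk ISR", 4, 10),
]
timeline = simulate(arrivals, end_time=45)
for start, end, name in timeline:
    print(f"t = {start:2d} to {end:2d}: {name}")
```

Under these assumptions the printer ISR is preempted at t = 15, the communication ISR finishes at t = 25, the disk ISR runs until t = 35, and the printer ISR completes at t = 40, after which the user program resumes.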
Interconnection Structures
The collection of paths connecting the various modules is called the interconnection structure.

Figure 3.15 Computer Modules
Memory (N words, addresses 0 to N–1): inputs Read, Write, Address, Data; output Data
I/O module (M ports): inputs Read, Write, Address, Internal data, External data; outputs Internal data, External data, Interrupt signals
CPU: inputs Instructions, Data, Interrupt signals; outputs Address, Control signals, Data

The interconnection structure must support the following types of transfers:
➢ Memory to processor
➢ Processor to memory
➢ I/O to processor
➢ Processor to I/O
➢ I/O to or from memory

Interconnection structures:
(1) Bus interconnection
(2) Point-to-point interconnection
Interconnection Structures – Bus interconnection
A bus is a communication pathway connecting two or more devices.
Typically, a bus consists of multiple communication pathways (lines).
Each line is capable of transmitting signals (binary digits) representing binary 1 and binary 0.

For example, an 8-bit unit of data can be transmitted over eight bus lines.

Most buses are shared:
• Only one device at a time can successfully transmit.
• If multiple devices transmit simultaneously, their signals overlap and mix.

The wires that comprise a bus are called lines.

Illustration of a bus (of 16 lines) on a motherboard (Fig 14.2 & 14.3 in Commer book)
Interconnection Structures – Bus interconnection
A bus that connects major computer components (processor, memory, I/O) is called a system bus.
The most common computer interconnection structures are based on the use of one or more system buses.

Bus Interconnection (Fig 3.16)

The data lines provide a path for moving data among system modules.
The address lines are used to designate the source or destination of the data on the data bus.
The control lines are used to control the access to and the use of the data and address lines.

data lines = data bus address lines = address bus


Interconnection Structures – Bus interconnection
data lines
The data bus may consist of 32, 64, 128, or even more separate lines; the number of lines is referred to as the width of the data bus.
Each line can carry only one bit at a time, so the number of lines determines how many bits can be transferred at a time.
Width is a key factor in the overall performance of the system.

address lines
The width of the address bus determines the maximum possible addressable memory capacity of the system.
8-bit width → 2⁸ = 256 addresses (i.e., 256 memory cells)
16-bit width → 2¹⁶ = 65,536 addresses
The address bus is used not only for the memory module but also for I/O ports. The first bit defines where to go: memory or I/O module.
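The address-width arithmetic, and the idea of using the first address bit to choose between memory and I/O, can be checked with a short sketch. The slides do not say which bit value selects which module, so the convention below (0 = memory, 1 = I/O) and the function names are assumptions.

```python
def addressable_locations(address_bus_width):
    """Number of distinct addresses an n-bit address bus can express."""
    return 2 ** address_bus_width


def decode(address, width=16):
    """Split an address into (target, offset) using the first (top) bit.

    Assumed convention: top bit 0 selects memory, top bit 1 selects I/O.
    """
    select_bit = address >> (width - 1)
    offset = address & ((1 << (width - 1)) - 1)
    return ("I/O" if select_bit else "memory", offset)


for width in (8, 16):
    print(f"{width}-bit address bus: {addressable_locations(width):,} addresses")

print(decode(0x0940))  # top bit 0: a memory location
print(decode(0x8005))  # top bit 1: an I/O port
```

Note the trade-off in this scheme: reserving one bit for the memory/I-O choice leaves only width − 1 bits for locations within each module.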

control lines
Control signals transmit both command and timing information among system modules.
Interconnection Structures – Point-to-point interconnection
Problems with bus interconnection:
• Electrical constraints are encountered when increasing the frequency of wide synchronous buses.
• It is difficult to perform the synchronization and arbitration functions in a timely fashion at higher data rates.
• It is difficult to increase the bus data rate and reduce bus latency to keep up with the processors when using multicore chips.

The point-to-point interconnect has lower latency, higher data rate, and better scalability.

Intel QuickPath Interconnect (QPI, developed by Intel in 2008) → a well-known, common point-to-point interconnection structure

Significant characteristics of QPI and other point-to-point interconnect schemes:
➢ Multiple direct connections → direct pairwise connections
➢ Layered protocol architecture → communication follows defined rules at each layer
➢ Packetized data transfer → packets (data, control headers, error checking codes)
Direct QPI connections can be established between each pair of core processors.
QPI is used to connect to an I/O module, called an I/O hub (IOH).
The link from the IOH to the I/O device controller uses an interconnect technology called PCI Express (PCIe).

QPI is defined as a four-layer protocol architecture:
■ Physical layer → Consists of the actual wires carrying the signals. The unit of transfer is the phit (physical unit).
■ Link layer → Responsible for reliable transmission and flow control. The unit of transfer is the flit (flow control unit).
■ Routing layer → Directs packets through the fabric.
■ Protocol layer → The high-level set of rules for exchanging packets of data between devices.

Figure 3.17 Multicore Configuration Using QPI: cores A, B, C, and D, each with its own DRAM on a memory bus, are linked by QPI; two I/O hubs connect to the I/O devices over PCI Express.
Intel QPI Layers (Fig. 3.18)
Protocol layer ↔ Protocol layer: packets (1 or more flits)
Routing layer ↔ Routing layer
Link layer ↔ Link layer: flits (80 bits each)
Physical layer ↔ Physical layer: phits (20 bits each)

Each data path consists of a pair of wires that transmits data one bit at a time; the pair is referred to as a lane.
There are 20 data lanes in each direction (transmit and receive), plus a clock lane in each direction.

Figure 3.19 Physical Interface of the Intel QPI Interconnect: the Intel QuickPath Interconnect port of component A drives its transmission lanes and forwarded clock (Fwd Clk) into the reception lanes and received clock (Rcv Clk) of component B's port, and vice versa.
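With 80-bit flits and 20-bit phits, one flit is carried as four phits, each phit using the 20 data lanes once. The sketch below splits a flit into phits and reassembles it. Sending the most significant phit first is an assumption made for the example; the function names are likewise illustrative and not part of any Intel interface.

```python
PHIT_BITS = 20  # one bit per data lane
FLIT_BITS = 80  # one flit = four phits


def flit_to_phits(flit):
    """Split an 80-bit flit into four 20-bit phits (most significant first)."""
    if not 0 <= flit < (1 << FLIT_BITS):
        raise ValueError("flit must fit in 80 bits")
    mask = (1 << PHIT_BITS) - 1
    count = FLIT_BITS // PHIT_BITS
    return [(flit >> (PHIT_BITS * i)) & mask for i in reversed(range(count))]


def phits_to_flit(phits):
    """Reassemble a flit from its phits, in the order they were sent."""
    flit = 0
    for phit in phits:
        flit = (flit << PHIT_BITS) | phit
    return flit


flit = 0x123456789ABCDEF01234  # 20 hex digits = 80 bits
phits = flit_to_phits(flit)
print([f"{p:05X}" for p in phits])
print(phits_to_flit(phits) == flit)
```

Each phit prints as five hexadecimal digits because 20 bits is exactly five hex digits.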
PCI Express (PCIe)
PCI: Peripheral Component Interconnect is a popular high-bandwidth, processor-independent bus.

PCI delivers better system performance for high-speed I/O subsystems.

PCI is a bus-based interconnection structure.
PCIe is a point-to-point interconnection structure.

With PCIe:
• Data flow is handled according to a priority system. Data with higher priority is processed first, which is important for many applications such as real-time data.
• To define the processing order, the data is tagged with properties such as its priority, type, and sensitivity to delay.

For PCIe:
• A key requirement is high capacity to support the needs of higher data rate I/O devices, such as Gigabit Ethernet.
• Another requirement deals with the need to support time-dependent data streams.
PCI Express (PCIe)
Root complex: also referred to as a chipset or a host bridge. The root complex acts as a buffering device, to deal with differences in data rates between I/O controllers and memory and processor components.

Switch: The switch manages multiple PCIe streams.

PCIe endpoint: An I/O device or controller that implements PCIe, such as a Gigabit Ethernet switch, a graphics or video controller, disk interface, or a communications controller.

Figure 3.21 Typical Configuration Using PCIe: the cores and memory attach to the chipset (root complex); PCIe links from the chipset lead to Gigabit Ethernet, a PCIe–PCI bridge, and a switch; the switch connects to PCIe endpoints and a legacy endpoint.


PCI Express (PCIe)
PCIe is defined as a three-layer protocol architecture (Figure 3.22 PCIe Protocol Layers):
■ Physical layer → Consists of the actual wires carrying the signals.
■ Data link layer → Responsible for reliable transmission and flow control.
■ Transaction layer → Generates and consumes data packets and also manages the flow control of those packets.

Data packets generated and consumed by the data link layer (DLL) are called Data Link Layer Packets (DLLPs).
Data packets generated and consumed by the transaction layer (TL) are called Transaction Layer Packets (TLPs).

NOTE:
Intel QPI is designed for high-speed, low-latency communication between processors.
PCIe, on the other hand, is developed for flexible, scalable, and universal communication with peripheral
(I/O) devices.
PCI Express (PCIe)
PCIe transactions are conveyed using transaction layer packets (TLPs).
TLPs define high-level transactions such as:
• Memory read/write
• I/O read/write
• Message
• Configuration access

When a transfer to an I/O device is needed:
1. A TLP is generated by the transaction layer (TL).
2. The TLP is passed down to the data link layer (DL).
3. The DL adds a Link CRC (LCRC) and may later generate DLLPs (such as ACK/NACK) to ensure reliable delivery.
4. TLPs and DLLPs are sent independently over the Physical Layer.
5. The Physical Layer transmits them serially to the receiver.

TLPs consist of the following fields:
■■ Header: The header describes the type of packet and information needed by the receiver to process the packet, including any needed routing information.
■■ Data: A data field of up to 4096 bytes may be included in the TLP (i.e. the data itself for a write). Some TLPs do not contain a data field.
■■ ECRC: An optional end-to-end CRC field enables the destination TL to check for errors in the header and data portions of the TLP.

A TLP originates in the TL of the sending device and terminates at the TL of the receiving device.
A DLLP originates in the DL of the sending device and terminates at the DL of the receiving device.
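The layering described above can be sketched as follows: the transaction layer builds a TLP (header, optional data, optional ECRC), and the data link layer appends an LCRC that the receiver checks before answering ACK or NACK. This is a rough model only. The byte layout, the function names, and the use of Python's `zlib.crc32` as a stand-in checksum are assumptions; real PCIe defines its own field formats and CRC computation.

```python
import zlib
from dataclasses import dataclass
from typing import Optional

MAX_DATA_BYTES = 4096  # upper limit on the TLP data field, per the text


@dataclass
class TLP:
    header: bytes
    data: bytes = b""
    ecrc: Optional[int] = None  # optional end-to-end CRC


def make_tlp(header, data=b"", with_ecrc=False):
    """Transaction layer: build a TLP, optionally protected by an ECRC."""
    if len(data) > MAX_DATA_BYTES:
        raise ValueError("TLP data field is limited to 4096 bytes")
    # zlib.crc32 is a stand-in here, not the CRC that PCIe specifies.
    ecrc = zlib.crc32(header + data) if with_ecrc else None
    return TLP(header, data, ecrc)


def data_link_send(tlp):
    """Data link layer (sender): serialize the TLP and append an LCRC."""
    body = tlp.header + tlp.data
    if tlp.ecrc is not None:
        body += tlp.ecrc.to_bytes(4, "big")
    lcrc = zlib.crc32(body)
    return body + lcrc.to_bytes(4, "big")


def data_link_receive(frame):
    """Data link layer (receiver): check the LCRC, answer ACK or NACK."""
    body, lcrc = frame[:-4], int.from_bytes(frame[-4:], "big")
    return "ACK" if zlib.crc32(body) == lcrc else "NACK"


frame = data_link_send(make_tlp(b"MWr", b"\x01\x02", with_ecrc=True))
print(data_link_receive(frame))

corrupted = bytes([frame[0] ^ 0xFF]) + frame[1:]  # damage the first byte
print(data_link_receive(corrupted))
```

The two checksums guard different spans: the LCRC covers one link and is checked by the receiving data link layer, while the ECRC travels with the TLP end to end and is checked by the destination transaction layer.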
Solve the problem

Fig.3.5 Fig.3.4

Notes for solution:


1. The address for I/O device 5 is 005
2. The address for I/O device 6 is 006
3. Remember the example in the previous slide and prepare a 6-step
visual solution as given in Fig 3.5. Also explain your steps as we did
in the lecture.
