
Unit 1

The document provides an overview of computer architecture, detailing the design and implementation of instruction sets, memory addressing techniques, and the components of a computer system. It outlines eight key ideas in computer design, such as Moore's Law and performance optimization through parallelism and pipelining. Additionally, it describes the hardware and software components of a computer, including the CPU, memory types, and the role of operating systems.

Uploaded by

Krithikaa Venket
Copyright
© All Rights Reserved

UNIT - I

OVERVIEW AND INSTRUCTION


COMPUTER ARCHITECTURE
• Computer Architecture deals with the design and implementation of the instruction
set, information formats and memory addressing techniques of a computer.
• Computer Organization refers to the operational units and their interconnections
that describe the function and design of various units of a computer.
• A computer architect designs the instruction set and the memory addressing
modes.
1.1 EIGHT IDEAS
Over the last 60 years of computer design, computer architects have proposed eight great
ideas. They are,
1. Design for Moore’s Law
2. Use Abstraction to Simplify Design
3. Make the Common Case Fast
4. Performance via Parallelism
5. Performance via Pipelining
6. Performance via Prediction
7. Hierarchy of Memories
8. Dependability via Redundancy
1.1.1 Design for Moore’s law
 Moore’s Law was stated by Gordon E. Moore, co-founder of Intel. It observes that the
number of transistors on an integrated circuit doubles roughly every 18 to 24 months.
 The design of a computer takes many years.
 The design of a system may start with an existing technology.
 By the end of the project, the technology may have advanced and the product would have
to be reworked.
 Hence, computer architects must anticipate the technology that will be available when
the project finishes rather than design for the technology that exists when it starts.
 The Moore’s Law graph (figure not reproduced here) represents the concept “up and to
the right”, which means that technology changes rapidly.

1.1.2 Use Abstraction to Simplify Design


 Computer architects and programmers use abstractions (Generalization of concepts)
to represent the design at several levels.
 The detail represented at each level hides the details of lower levels.
 This may improve productivity since abstraction simplifies design and thus the
design time decreases.
 This provides a simpler design model due to abstraction.
 Example
o Operating systems hide the details involved in handling input and output
devices.
o High-level languages hide the details of the sequence of instructions needed
to accomplish a task.

1.1.3 Make the common case fast


 Performance is improved more by speeding up the common case than by optimizing
the rare case.
 The common case is often simpler than the rare case, so it is also easier to enhance.
 The gain obtainable from such an improvement is quantified by Amdahl’s law.
 Example
o It is easier to build a fast sports car with a capacity of one or two passengers
than a fast minivan with a capacity of six or seven.
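
A minimal sketch of how Amdahl’s law puts a number on this idea. The function name `amdahl_speedup` and the sample workload fractions are assumptions made for illustration; they do not come from the notes.

```python
def amdahl_speedup(fraction_improved: float, improvement_factor: float) -> float:
    """Overall speedup when `fraction_improved` of the original execution
    time is made `improvement_factor` times faster (Amdahl's law)."""
    if not 0.0 <= fraction_improved <= 1.0:
        raise ValueError("fraction_improved must lie between 0 and 1")
    if improvement_factor <= 0:
        raise ValueError("improvement_factor must be positive")
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / improvement_factor)


# Hypothetical workload: the same 10x enhancement applied to a common case
# (90% of the time) and to a rare case (10% of the time).
common_case = amdahl_speedup(0.90, 10)
rare_case = amdahl_speedup(0.10, 10)
print(f"common case: {common_case:.2f}x, rare case: {rare_case:.2f}x")
```

Under these assumed fractions, the enhancement to the common case gives an overall speedup of roughly 5.3, while the same enhancement to the rare case gives only about 1.1.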

1.1.4 Performance via parallelism


 Parallelism is a process of performing multiple jobs simultaneously.
 A processor engages in several activities in the execution of an instruction.
 Multiple operations are executed at the same time to increase performance.
 Larger problems are often subdivided into smaller units and are solved concurrently
through parallelism.

1.1.5 Performance via pipelining


 Pipelining is an extension of the idea of parallelism.
 A pipeline is a set of processing stages connected in series, where the output of one
stage is the input of the next one.
 Stages that work on different instructions operate in parallel to improve performance.
 Rather than processing each instruction sequentially, every instruction is split up
into a sequence of steps so that different steps of successive instructions can be
executed concurrently.
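
The benefit of overlapping steps can be sketched with a simple cycle count. The sketch assumes an idealized pipeline in which every step takes exactly one clock cycle and there are no stalls; the function names are hypothetical.

```python
def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction finishes all of its steps before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions: int, n_stages: int) -> int:
    """The first instruction needs n_stages cycles to fill the pipeline;
    after that, one instruction completes every cycle (no stalls assumed)."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


n, k = 100, 5  # assumed: 100 instructions, 5 steps per instruction
print(cycles_unpipelined(n, k), cycles_pipelined(n, k))
```

For long instruction sequences the ratio of the two counts approaches the number of stages, which is why the ideal speedup of a pipeline is said to equal its depth.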

1.1.6 Performance via prediction


 Conditional branches may stall the pipeline until the branch outcome is known,
reducing performance.
 The stall can be reduced by using a branch predictor that guesses the path taken by
a branch before the outcome is actually known.
 The branch predictor improves the flow of execution in the instruction
pipeline.
 The processor predicts the outcome of the condition test and starts
executing the indicated instruction rather than waiting for the correct answer.
 Performance is improved if the guesses are reasonably accurate and the
penalty of wrong guesses is not too severe.
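
One common way to make such guesses is a 2-bit saturating counter. The notes do not name a specific prediction scheme, so the class below is an illustrative assumption rather than the method the text has in mind; the names `TwoBitPredictor` and `accuracy` are hypothetical.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0 and 1 predict 'not taken',
    states 2 and 3 predict 'taken'."""

    def __init__(self, state: int = 0):
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step towards the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes) -> float:
    """Fraction of branch outcomes that the predictor guesses correctly."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# Assumed workload: a loop branch taken 9 times and then not taken, run 10 times.
loop_branch = ([True] * 9 + [False]) * 10
print(accuracy(loop_branch))
```

Because the counter needs two wrong guesses in a row to change its prediction, the single "not taken" outcome at each loop exit costs only one misprediction.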

1.1.7 Hierarchy of memories


 Users need the memory to be very fast, large, and cheap.
 A computer has a range of memory units: registers and cache memories are
fast and small, while secondary storage memories are slow and large.
 Cache memory is a small high-speed memory that holds recently accessed data.
 The memory hierarchy (figure not reproduced here) runs from the fastest and smallest
level to the slowest and largest: registers, cache memory, main memory, secondary storage.

1.1.8 Dependability via redundancy


 Computers need to be dependable, yet any device can fail.
 Hence redundant modules (for example, copies of data) are maintained so that
the user can recover data when a failure occurs.
 One of the best-known applications in data storage is the RAID concept (Redundant Array
of Inexpensive Disks).
 Data is stored redundantly on multiple disks, which allows it to be recovered after a
disk failure.
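
The recovery idea can be sketched with a single XOR parity block, as used by parity-based RAID levels. The notes do not specify a RAID level, so this scheme, the function names and the sample data are assumptions for illustration only.

```python
from functools import reduce
from typing import List


def parity_block(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks byte by byte to form a parity block."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_block(surviving: List[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing block from the survivors and the parity."""
    return parity_block(surviving + [parity])


# Hypothetical data blocks, one per disk.
disks = [b"abcd", b"efgh", b"ijkl"]
parity = parity_block(disks)

# Suppose the second disk fails: XOR of the survivors and the parity restores it.
restored = recover_block([disks[0], disks[2]], parity)
print(restored)
```

XOR works here because every block appears twice in the combined XOR except the missing one, and any value XORed with itself cancels to zero.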

1.2 COMPONENTS OF A COMPUTER SYSTEM


Concept
 A computer is an information processing machine.
 It consists of a number of interrelated components that work together to convert data
into information.
 The processing is carried out electronically, usually with no intervention from a human
user.
 Input unit accepts the information from the user using input devices.
 The information received is either stored in the computer’s memory for later
reference or immediately used by the arithmetic and logic unit to perform the
desired operations.
 The information is processed using the instructions (software) stored in the
computer.
 The results are sent back to the user through the output unit.
 All the above actions are coordinated by the control unit.
 The list of instructions that performs a task is called a program, which is stored
in the memory.
Components of a computer
1. Hardware component
2. Software component

1.2.1 Hardware component


The electronic components interconnected in the computer system constitute the
hardware components of a system.
A computer consists of the following functional units.
 Input Unit
 Central Processing Unit
o Memory Unit – Primary memory, secondary memory, Cache memory,
Registers
o Arithmetic and logic Unit
o Control Unit
 Output Unit
Figure: Functional units of a computer (input unit, memory unit, ALU, control unit, output unit)
Input Unit
 Input devices are electromechanical devices that allow the user to feed information
into the computer for analysis and storage.
 An input device captures information and translates it into a form that can be processed
by the CPU.
 A computer accepts input in two ways. They are,
o Manual entry: the information is entered using a keyboard or mouse.
o Direct entry: the information is fed into the computer automatically
from a source document such as a barcode.
 Examples of input devices: keyboard, and pointing devices such as mouse and joystick.
 Whenever a key is pressed, the corresponding letter or digit is automatically
translated into its corresponding binary code.
 The code is then transmitted over a cable to either the computer memory or the processor.
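
As a small illustration of the translation into binary code, the sketch below maps a character to its 8-bit ASCII pattern. Using ASCII is an assumption made here; real keyboards first send device-specific scan codes, which the notes do not describe. The function name is hypothetical.

```python
def key_to_binary(key: str) -> str:
    """Return the 8-bit binary form of a single character's ASCII code."""
    if len(key) != 1:
        raise ValueError("expected exactly one character")
    code = ord(key)
    if code > 127:
        raise ValueError("character is outside the ASCII range")
    return format(code, "08b")


for key in ("A", "a", "5"):
    print(key, key_to_binary(key))
```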
Central Processing Unit (CPU)
 It is referred to as ‘the brain of a computer system’.
 It converts data (input) into meaningful information (output).
 It is a highly complex, extensive set of electronic circuitry which executes stored
program instructions called software.
 It controls all internal and external devices and performs arithmetic and logic
operations
 It controls the usage of main memory to store data and instructions and controls the
sequence of operations.
 It consists of three main subsystems.
o Arithmetic and Logic Unit (ALU)
o Memory Unit
o Control Unit (CU)

Arithmetic Logic Unit (ALU)

It contains the electronic circuitry that executes all arithmetic and logical operations on
the data. The ALU comprises two units:

Arithmetic Unit
 Contains the circuitry that is responsible for performing the actual computing and
carrying out the arithmetic calculations (+, -, *, /).
 To perform these operations, operands from the main memory are brought into the processor.
 After the operation is performed, the results are stored back in the memory location.
 It can perform these operations at very high speed.

Logic Unit
 It enables the CPU to perform logical operations based on the instructions provided
to it.
 Example: Logical comparison between data. ( Logical Operations : =, <, > conditions)
Memory Unit
 Memory refers to the electronic holding place for instructions and data.
 Memory also stores the intermediate results and output.
 Memory is mainly classified into two categories: primary and secondary.

Primary Memory
 It is also known as main memory, internal memory or built-in memory.
 It stores data and instructions for processing.
 It is directly accessible by the CPU.
 It is a fast memory that operates at electronic speeds.
 The memory contains a large number of semiconductor storage cells.
 Each cell carries 1 bit of information.
 The cells are processed in groups of fixed size called words.
 To provide easy access to any word in memory, a distinct address is associated
with each word location.
 Addresses are numbers that identify successive locations.
 The number of bits in each word is called the word length.
 The word length typically ranges from 16 to 64 bits.
 It can be classified as random access memory (RAM) and read-only memory (ROM).
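
The relationship between words, addresses and word length can be sketched as follows. The function names and the sample memory size are assumptions for illustration; the sketch assumes one distinct address per word, as described above.

```python
def address_bits_needed(num_words: int) -> int:
    """Smallest number of address bits that gives every word a distinct address."""
    if num_words < 1:
        raise ValueError("memory must contain at least one word")
    return (num_words - 1).bit_length()


def capacity_in_bits(num_words: int, word_length: int) -> int:
    """Total storage: number of words times bits per word."""
    return num_words * word_length


# Assumed memory: 65,536 words of 32 bits each.
print(address_bits_needed(65_536), capacity_in_bits(65_536, 32))
```

With k address bits, 2^k distinct word locations can be identified, so 65,536 words need 16 address bits.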

Differences between RAM and ROM

| Features | RAM | ROM |
|---|---|---|
| Stands for | Random Access Memory | Read-Only Memory |
| Volatility | RAM is volatile, i.e. its contents are lost when the device is powered off. | It is non-volatile, i.e. its contents are retained even when the device is powered off. |
| Types | The two main types of RAM are static RAM and dynamic RAM. | The types of ROM include PROM, EPROM and EEPROM. |
| Use | RAM allows the computer to read data quickly to run applications. It allows reading and writing. | ROM stores the program required to initially boot the computer. It only allows reading. |
| Definition | Random Access Memory or RAM is a form of data storage that can be accessed randomly at any time, in any order and from any physical location. | Read-only memory or ROM is also a form of data storage that cannot be easily altered or reprogrammed. |

Secondary Memory

 It is also known as auxiliary memory or external memory.

 It is used for storing software programs and data.

 It is less expensive than primary memory and stores a much larger volume of data.

 The data and instructions stored on such devices are permanent in nature.

 They are lost only if the user deletes them or if the device is destroyed.

 Example: Pen drive, Floppy Disk, Compact Disk, External hard disk, etc.
Differences between primary and secondary memory

| Sl. No. | Primary Memory | Secondary Memory |
|---|---|---|
| 1. | It is also called main, internal or built-in memory. | It is also known as auxiliary or external memory. |
| 2. | It is present inside the computer. | It is external to the computer and can be connected to it. |
| 3. | It is smaller in size and holds limited data. | It is larger in size and holds a massive amount of data. |
| 4. | Due to its locality, it transfers data faster. | Since it is outside the CPU, it is slower when compared to primary memory. |
| 5. | It is costlier than secondary memory. | It is cheaper than main memory. |
| 6. | Example: RAM, ROM | Example: Disk drives, optical disks, magnetic tape drives. |

Cache memory
 Cache is a high-speed memory located between RAM and the CPU.
 It increases the speed of processing since it holds the most frequently used data.
 It is highly expensive and smaller in size.
 It is present in two or three levels in a system: L1, L2 and L3 cache memories.
 The cache size typically ranges from 256 KB to 2 MB.
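
A minimal sketch of why a small cache helps when the same data is accessed repeatedly. It assumes a fully associative cache with least-recently-used (LRU) replacement, which is one possible policy and is not stated in the notes; the function name and the address trace are hypothetical.

```python
from collections import OrderedDict


def count_hits(addresses, capacity: int) -> int:
    """Count cache hits for a sequence of memory addresses, using a
    fully associative cache of `capacity` entries with LRU replacement."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    cache = OrderedDict()  # keys are cached addresses, oldest first
    hits = 0
    for address in addresses:
        if address in cache:
            hits += 1
            cache.move_to_end(address)      # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict the least recently used
            cache[address] = True
    return hits


trace = [1, 2, 1, 3, 1, 2]  # assumed access pattern
print(count_hits(trace, capacity=2))
```

Every hit is an access served by the cache instead of the slower main memory.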

Register memory
 Registers are special-purpose, high-speed temporary memory units.
 They hold various types of information such as data, instructions, addresses and the
intermediate results of calculations.
 They hold the information that the CPU is currently working on.
 They are said to be the “CPU’s working memory”, an additional storage location that
offers the advantage of speed.
 Registers are of two types: general purpose and special purpose registers.
o General purpose registers
 These are a set of registers that the programmer uses to store temporary data
and addresses.
 They include floating point registers, constant registers, vector registers etc.
o Special purpose registers
 These are registers that the CPU uses for specific purposes.
 They include the program counter (PC), accumulator (ACC),
instruction register (IR), memory address register (MAR), memory
buffer register (MBR), memory data register (MDR) etc.
 The program counter contains the address of the next instruction
to be processed.
 The accumulator stores the results of arithmetic and logical
operations.
 The instruction register holds the current instruction that has been fetched.
 The MAR contains the address of the next location in memory
that is to be accessed.
 The MBR stores temporary data and the MDR stores the operands
of the expression being processed.
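
How these registers cooperate can be sketched with a toy accumulator machine. The four opcodes (`LOAD`, `ADD`, `STORE`, `HALT`), the memory layout and the function name `run` are invented for this illustration and do not correspond to any real instruction set; the sketch also uses a single buffer register (MBR) for simplicity.

```python
def run(memory: list, max_steps: int = 1000) -> dict:
    """Run a program stored in `memory` on a toy accumulator machine.

    Instructions are (opcode, address) tuples; data words are integers.
    """
    pc, acc = 0, 0                 # program counter, accumulator
    ir = mar = mbr = None          # instruction, address and buffer registers
    for _ in range(max_steps):
        # Fetch: MAR <- PC, MBR <- memory[MAR], IR <- MBR, then PC <- PC + 1
        mar = pc
        mbr = memory[mar]
        ir = mbr
        pc += 1
        # Decode
        opcode, address = ir
        if opcode == "HALT":
            break
        # Execute: every remaining opcode accesses memory at `address`
        mar = address
        if opcode == "LOAD":
            mbr = memory[mar]
            acc = mbr
        elif opcode == "ADD":
            mbr = memory[mar]
            acc += mbr
        elif opcode == "STORE":
            mbr = acc
            memory[mar] = mbr
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return {"PC": pc, "ACC": acc, "IR": ir, "MAR": mar, "MBR": mbr}


program = [
    ("LOAD", 4),    # address 0: ACC <- memory[4]
    ("ADD", 5),     # address 1: ACC <- ACC + memory[5]
    ("STORE", 6),   # address 2: memory[6] <- ACC
    ("HALT", 0),    # address 3: stop
    2, 3, 0,        # addresses 4-6: data words
]
registers = run(program)
print(registers["ACC"], program[6])
```

Note that the program counter is incremented during the fetch step, so it always holds the address of the next instruction while the current one executes.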

Control Unit


 This unit checks the correctness of sequence of operations.
 It fetches the instructions from the primary memory, interprets them and ensures
correct execution of the program.
 It also controls the input and output devices.
 It directs the overall functioning of the other units of the computer.
 In addition, the control unit
o Supervises and controls the path of information that runs over the
processor.
o Organizes the various activities of those units that lie inside it.
o Guides the flow of data through the different parts of the computer.
o Interprets the instructions.
o Regulates the time controls of the processor.
o Sends and receives control signals from various peripheral devices.

Output Devices

 Output devices take the machine-coded results from the CPU and convert
them into a form that is easily readable by human beings.

 The output can be obtained in two forms: hardcopy and softcopy.

o The physical form of the output (for example, a printout) is called hardcopy.

o The output which resides in memory or is shown on a display is called softcopy.

 Example: monitors, printers, plotters, audio response units etc.

1.2.2 Software Component

 Software is the collection of instructions written in a computer language.

 It is responsible for controlling, integrating and managing the hardware
components of a computer in order to accomplish a specific task.

 Software instructs the hardware to perform the desired task.

Types
 System Software
 Application Software

System Software

 System software is a set of programs that manages and supports the resources
and operations of a computer system.
 It executes various tasks such as processing data and information, controlling
hardware components, and allowing users to use application software.
o It is more transparent to, and less noticed by, the users.
o It usually interacts with the hardware or the applications.
o Basic functionality includes file management, visual display, keyboard
input, etc.
 Example: operating systems, device drivers, language translators, text editors,
utilities, loaders, linkers, etc.

Operating System
 An operating system (OS) is a collection of software that manages computer
hardware resources and provides common services for computer programs.
 The operating system is a vital component of the system software in a computer
system. Application programs require an operating system to function.
 The functions of OS include Disk Access, Memory Management, Task Scheduling,
and User Interfacing.
 It provides a software platform on top of which other programs run.
 Example: MS-DOS, Windows, Linux, UNIX.

Device Drivers
 These are system programs which are responsible for proper functioning of
devices.
 Whenever a new device is added to a computer system the driver must be installed
before the device is used.
 It acts as a translator between the device and the program that uses the device.
 It is not an independent program; it assists or is assisted by the OS for proper
functioning.
 Example: drivers for printers, monitors, mice and keyboards.

Programming Language Translators


 Language translators transform the instructions prepared by programmers in
a high-level or assembly language into a form that can be understood by the computer.
 Translators are divided into 3 categories: compiler, interpreter, and assembler.
 Compiler
 A compiler translates a high-level programming language into machine
language.
 It translates the entire source code into object code before the program is run.
 It is suited to larger applications.
 Example: C, C++, PASCAL compilers.

 Interpreter
 An interpreter translates and executes the source code line by line,
without looking at the entire program.
 Programs produced by compilers run much faster than the same programs run by
an interpreter.
 With an interpreter, it is easier to modify and re-run the source code.
 Example: BASIC.

 Assembler
 An assembler translates an assembly language program into machine language.

System Utility

 These programs perform day-to-day tasks related to the maintenance of the
computer system.

 They are used to support, enhance and secure existing programs and data in the
computer system.

 They are generally small programs, each having a specific task to perform.

Application Software

 This is the software most used by users.

 It is used to accomplish specific tasks.

 Application software consists of a single program (e.g. Notepad) or a collection
of programs, called a software package (e.g. Microsoft Office Suite).

Some of the most commonly used application software are as follows.

Word Processors
 A word processor is software used to compose, format, edit and print electronic
documents.
 It allows pictures, graphs and charts to be included, allows changes to alignment,
margins, font and color, and supports spell checking.
 Example: Microsoft Word, WordPerfect.

Spreadsheets
 A spreadsheet application is a rectangular grid, which allows text, numbers and
complex functions to be entered into a matrix of thousands of individual cells.
 Applications include payroll processing, financial record maintenance.
 Example: Microsoft Excel, Lotus 1-2-3.

Image Editors
 Image editor programs are designed specifically for capturing, creating, editing
and manipulating images.
 The programs provide a variety of special features for creating and altering images.
 They also enable the user to create and superimpose layers, import and export
graphic files, adjust an image and improve its appearance.
 Example: Adobe Photoshop, CorelDRAW.

Database Management Systems


 Database management software is a collection of computer programs that
supports structuring of the database in a standard format.
 It provides tools for data input, verification, storage, retrieval, query and
manipulation in an efficient manner.
 It protects the security and integrity of the database against unauthorized access.

 Example: Oracle, FoxPro.

Presentation Applications
 A presentation is a means of communicating work orally in the presence of an
audience.
 It combines both visual and verbal elements.
 Presentation software allows the user to create presentations by producing
slides/hand-outs for the presentation of projects.
 Example: Microsoft PowerPoint.

Desktop Publishing Software


 The desktop publishing is a technique of using a personal computer to design
images and pages, and assemble type and graphics, then using a printer to
output the assembled pages onto paper.
 This software is used for creating magazines, books etc.
 Example: Adobe PageMaker, QuarkXPress.
1.3 BUS STRUCTURE
 A group of lines that serves as the connecting path from one device to another is
called a Bus.
 A bus may be defined as a set of communication lines/data paths that carry
data, address or control signals among the various units of a computer.
 A memory/ system bus interconnects the processor with memory units and I/O
units.
 When a word of data is transferred between units, all bits are transferred in parallel.
 The bits are transferred simultaneously over many wires, or lines, with one bit
per line.
 There are 2 types of Bus structures. They are,
o Single Bus Structure
o Multiple Bus Structure

Single Bus Structure


 All the components (I/O, memory and processor) are connected to a common
bus.
 It allows only one transfer at a time because only two units can actively use the
bus at any given time.
 The advantages of the single bus structure are
o Low cost
o High flexibility in attaching peripheral devices.
 The main drawback is low performance, since only one transfer can take place at a time.

Figure 1.3 Single Bus Structure

Multiple Bus Structure

 To achieve high performance, multiple bus structures are used.
 This allows two or more transfers to be carried out at the same time.
 This provides better performance but at an increased cost.
 The devices connected to a bus vary widely in their speed of operation.
 Hence buffer registers are used to hold information during transfers.
 Thus, the buffer register prevents a high-speed processor from being locked to a
slow I/O device during data transfer.
 This allows the processor to switch rapidly from one device to another.

1.4 TECHNOLOGIES FOR BUILDING PROCESSORS AND MEMORY


 A computer architect must plan the design based on expected future changes in technology.
 The designer must be aware of rapid changes in implementation technology.
 There are four implementation technologies to be considered. They are,
o Integrated Circuit logic technology
o Semiconductor DRAM
o Magnetic disk technology
o Network technology

Integrated circuit logic technology


 A transistor is a small electronic device made of semiconductor material that
can switch or amplify current.
 Discrete transistors were used in II generation computers, which were slow by later standards.
 Because discrete transistors limited capacity, integrated circuits (ICs) were introduced.
 IC technology emerged with the fabrication of many transistors on a single chip.
 The Small Scale Integration (SSI) technology involved few transistors (< 100)
on a silicon chip.
 Then emerged the Medium Scale Integration (MSI), with hundreds of
transistors on a chip.
 The III generation computers relied on SSI and MSI technology.
 The IV generation computers used Large Scale Integration (LSI) and Very Large
Scale Integration (VLSI) technology, with thousands or more transistors
on a single chip.
 The V generation computers depend on Ultra Large Scale Integration
(ULSI) technology, with several millions of transistors on a chip.

 The transistor density increases by about 35% per year.

| Year | Technology used in computers | Relative performance/unit cost |
|---|---|---|
| 1951 | Vacuum tube | 1 |
| 1965 | Transistor | 35 |
| 1975 | Integrated circuit | 900 |
| 1995 | Very large-scale integrated circuit | 2,400,000 |
| 2005 | Ultra large-scale integrated circuit | 6,200,000,000 |

Semiconductor DRAM (dynamic random-access memory)


 DRAM is a semiconductor memory device that stores each bit of data in a separate
capacitor, a passive electronic component.
 Capacity increases by about 40% per year, which corresponds to doubling roughly every
two years.

Magnetic disk technology


 Magnetic storage is the storage of data on a magnetized medium.
 The density increased by about 30% per year before 1990, and the rate rose to 100%
per year in 1996.
 It has dropped back to 30% per year since 2004.
 Even so, disks are still 50–100 times cheaper per bit than DRAM.
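
Annual growth rates like those quoted in this section can be converted into doubling times, which are often easier to compare. The sketch below assumes steady compound growth; the function name is hypothetical.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a steady compound rate.

    `annual_growth` is a fraction, e.g. 0.40 for 40% per year.
    """
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    return math.log(2) / math.log(1 + annual_growth)


for rate in (0.30, 0.35, 0.40, 1.00):
    print(f"{rate:.0%} per year -> doubles every {doubling_time_years(rate):.2f} years")
```

A rate of 100% per year means doubling every year, while 40% per year corresponds to doubling in a little over two years.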

Network technology
 To provide communication from one computer to another, LANs (Local
Area Networks) such as Ethernet, MANs (Metropolitan Area Networks), WANs (Wide
Area Networks) such as the Internet, and wireless networks such as Wi-Fi and Bluetooth
were introduced.
 Networking technologies allow data sharing and communication.
 Although technology improves continuously, the impact of these improvements
can come in distinct leaps.

1.5 PERFORMANCE
 The prime factors of the success of a computer are the speed and cost.
 Performance depends on how fast machine instructions can be brought into
the processor for execution and how fast they can be executed.
 The performance of a computer is dependent on
o the design of the compiler
o the machine instruction set
o the hardware
Terminologies
Response time
 It is the time between the start and completion of a task.
 It refers to the execution time of a set of instructions.
 Faster execution of instructions reduces the response time and thus
increases throughput.
Throughput
 It is the total work done in a unit time period.
Elapsed Time
 The total time required to execute the program is called the elapsed time.
 It depends on all the units in computer system.
Processor Time
 The period in which the processor is active is called the processor time.
 It depends on the hardware involved in the execution of the machine instructions.
Clock
 The processor circuits are controlled by a timing signal called a clock.
Clock Cycle
 The clock defines a regular time interval called a clock cycle.

Bandwidth
 The amount of data that can be transferred from one point to another in a given
time period is called bandwidth.
 It is expressed in bits per second (bps).
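
As a short illustration of the definition, the time to move a given amount of data follows directly from the bandwidth. The function name and the sample figures are assumptions; the sketch ignores latency and protocol overhead.

```python
def transfer_time_seconds(size_bytes: int, bandwidth_bps: float) -> float:
    """Time to transfer `size_bytes` over a link of `bandwidth_bps` bits per second."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes * 8 / bandwidth_bps


# Assumed example: 1,000,000 bytes over an 8,000,000 bps link.
print(transfer_time_seconds(1_000_000, 8_000_000))
```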

Clock cycles per instruction (CPI)


 CPI is the average number of clock cycles per instruction for a program or program fragment.
CPU clock cycles = Instructions for a program × Average clock cycles per instruction

Execution of an Instruction
 At the start of execution all program instructions and the required data are stored
in the main memory.
 As execution proceeds, instructions are fetched one by one over the bus into the
processor, and a copy is placed in the cache.
 When the execution of an instruction calls for data located in the main memory,
the data are fetched and a copy is placed in the cache.
 A Program will be executed faster if the movement of instruction and data between
the main memory and the processor is minimized, which is achieved by using the
Cache.
 To execute a machine instruction, the processor divides the action to be performed
into a sequence of basic steps; each step can be completed in one clock cycle.
Clock rate, R = 1/P (measured in cycles per second),
where P is the length of one clock cycle.

Basic Performance Equation


Let
T = the processor time required to execute a program,
N = the actual number of instruction executions,
S = the average number of basic steps needed to execute one machine instruction.
If the clock rate is R cycles/second, the program execution time is given by
$$T = \frac{N \times S}{R}$$
T is the performance parameter. To achieve high performance, the computer designer must
reduce the value of T, which means reducing N and S and increasing R.
The value of N is reduced if the source program is compiled into fewer machine
instructions. The value of S is reduced if instructions have a smaller number of basic steps to
perform or if the execution of instructions is overlapped.
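
The basic performance equation can be evaluated directly to see how each parameter affects T. The function name and the numbers below are assumptions chosen for illustration; the symbols N, S and R are those defined in the text.

```python
def execution_time(n: int, s: float, r: float) -> float:
    """Basic performance equation T = (N * S) / R.

    n: number of instruction executions (N)
    s: average basic steps per instruction (S)
    r: clock rate in cycles per second (R)
    """
    if r <= 0:
        raise ValueError("clock rate must be positive")
    return n * s / r


baseline = execution_time(n=1_000_000, s=4, r=2_000_000)
fewer_steps = execution_time(n=1_000_000, s=2, r=2_000_000)   # S halved
faster_clock = execution_time(n=1_000_000, s=4, r=4_000_000)  # R doubled
print(baseline, fewer_steps, faster_clock)
```

Halving S or doubling R has the same effect on T in this model, which is why compilers, instruction sets and hardware all matter for performance.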

Pipelining
A considerable improvement in performance can be achieved by overlapping the execution
of successive instructions. This technique is called pipelining.

Superscalar Operation
Multiple instruction pipelines can be implemented in the processor, creating parallel paths
along which several instructions can be executed at the same time. This mode of operation is
called superscalar execution.

Clock Rate
There are two ways to increase the clock rate (R). They are,

 Improving the integrated circuit (IC) technology makes logic circuits faster,
which reduces the time needed to complete a basic step. This allows the clock
period, P, to be reduced and the clock rate, R, to be increased.

 Reducing the amount of processing done in one basic step also helps to reduce
the clock period P.

Performance Improvement
 To maximize performance, minimize the response time or execution time
for the task.
 For a computer X, performance is the reciprocal of execution time:
$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$
 For two computers X and Y, if the performance of X is greater than that of Y, then we
have
$$\text{Performance}_X > \text{Performance}_Y$$
$$\frac{1}{\text{Execution time}_X} > \frac{1}{\text{Execution time}_Y}$$
$$\text{Execution time}_Y > \text{Execution time}_X$$

Example
Time taken to run a program = 10 s on A, 15 s on B
Relative performance = Execution time of B / Execution time of A
= 15 s / 10 s
= 1.5
So A is 1.5 times faster than B.
Measuring Performance
 Measured in terms of seconds per program
 Defined as the total time taken to complete a task. The task includes
o disk access
o memory access
o I/O activities
o Execution of Instructions
 This time taken is known as wall-clock time / response time.

CPU Time
 Time the CPU spends computing for particular task and does not include the
time waiting for I/O.
 This is also called CPU execution time.

Formula

CPU time is the sum of two components:
$$\text{CPU time} = \text{user CPU time} + \text{system CPU time}$$
where user CPU time is the CPU time spent in the program and system CPU time is the CPU
time spent in the operating system.

Example
Let
 User CPU time = 90.7 seconds
 System CPU time = 12.9 seconds
 Elapsed time = 2 minutes and 39 seconds (159 seconds)
 CPU time as a fraction of elapsed time
$$\frac{\text{CPU time}}{\text{Elapsed time}} = \frac{90.7 + 12.9}{159} \approx 0.65$$

Performance Equation I

$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$

Clock rate is the inverse of clock cycle time.

 CPU execution time
$$\text{CPU execution time for a program} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$$

 Performance is improved by reducing the length of the clock cycle or the number of clock
cycles required for a program.
 Execution time depends on the number of instructions in a program.
 Number of clock cycles
$$\text{CPU clock cycles} = \text{Instructions for a program} \times \text{Average clock cycles per instruction}$$
 Clock cycles per instruction (CPI)
o The average number of clock cycles each instruction takes to execute

Performance Equation II

• CPU execution time in terms of instruction count, CPI and clock cycle time

CPU execution time = Instruction count × CPI × Clock cycle time

(OR)

CPU execution time = (Instruction count × CPI) / Clock rate
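
The performance equations can be checked numerically. The Python sketch below is illustrative only: the function names and the sample workload (instruction count, CPI and clock rate) are assumptions made for the example, not values from these notes.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU execution time = (instruction count x CPI) / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def relative_performance(time_slower, time_faster):
    """Ratio of execution times: how many times faster the quicker machine is."""
    return time_slower / time_faster


# Hypothetical workload: 2 billion instructions, CPI of 1.5, 2 GHz clock.
seconds = cpu_time(2_000_000_000, 1.5, 2_000_000_000)
print(seconds)  # 1.5

# The 10 s versus 15 s example used earlier in this section.
print(relative_performance(15, 10))  # 1.5
```

Halving the CPI or doubling the clock rate in the call to cpu_time halves the result, which is the trade-off the two equations express.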

1.6 POWER WALL


• To improve performance, processors are run at high clock rates, which generates more heat and consumes more power.
• If the clock rate increases, power consumption also increases.
Power Issues
 Power has to be provided to the processor and it has to be distributed around
the chip.
• Power consumed by a device is dissipated as heat, which must be removed.
Power Consumption
• In CMOS chips, dynamic power, the power consumed when transistors switch state, is the dominant source of energy consumption.
• The power required per transistor is proportional to the product of the load capacitance of the transistor, the square of the voltage, and the frequency of switching.
• Power consumed by the CPU is given by

Pdynamic = C × V² × F
where,
• P is the power
• C is the capacitive load
• V is the applied voltage
• F is the switching frequency

 Mobile devices care about battery life more than power, so energy is the proper
metric to be considered.
 Energy is measured in joules.
Energydynamic = Capacitive load × Voltage²
 In CMOS, static power is becoming an important issue because leakage current
flows even when a transistor is off.
 Thus, increasing the number of transistors increases power even if they are turned
off, and leakage current increases in processors with smaller transistor sizes.
 As a result, very low power systems are even gating the voltage to inactive modules
to control loss due to leakage.

Powerstatic = Currentstatic × Voltage
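
Because voltage enters the dynamic power formula squared, lowering voltage and frequency together pays off more than lowering frequency alone. The Python sketch below illustrates this with made-up numbers: the capacitance, voltage and frequency values and the function names are assumptions, not figures from these notes.

```python
def dynamic_power(cap_load, voltage, frequency):
    """Dynamic power following P = C x V^2 x F."""
    return cap_load * voltage ** 2 * frequency


def static_power(leakage_current, voltage):
    """Static power following P = I_static x V."""
    return leakage_current * voltage


# Hypothetical chip: 1 nF switched capacitance, 1.2 V supply, 2 GHz clock.
old = dynamic_power(1e-9, 1.2, 2e9)

# The same chip with voltage and frequency both lowered by 15 %.
new = dynamic_power(1e-9, 1.2 * 0.85, 2e9 * 0.85)

ratio = new / old
print(round(ratio, 3))  # 0.614
```

The ratio is 0.85 cubed: one factor of 0.85 comes from the frequency and two from the squared voltage.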

1.7 UNIPROCESSORS TO MULTIPROCESSORS

• Because of the limits imposed by power consumption, the design of microprocessors has changed.
• Rather than continuing to decrease the response time of a single program running on a single processor, designers moved to multiple processors per chip.
• The intention is to improve throughput rather than to decrease response time.
• To reduce confusion between the words processor and microprocessor, the processors are referred to as cores, and such microprocessors are known as "multicore microprocessors".
• Example
o A dual-core microprocessor is a chip that contains two processors or cores.
o A quad-core microprocessor is a chip that contains four processors or cores.
• In the past, programmers could rely on hardware, architecture and compilers to double the performance of their programs.
• Nowadays, programmers must rewrite their programs to support parallelism.

1.7.1 Multi-Processor
• Computers that contain several processor units are called multiprocessor systems.
• These systems either execute a number of different application tasks in parallel or execute subtasks of a single large task in parallel.
• When all processors have access to all memory locations, the systems are called shared-memory multiprocessor systems.
• The high performance of these systems comes with much increased complexity and cost.
1.7.2 Advantage of Multiprocessor System

• Improves the cost/performance ratio of the system.
• Tasks are divided among several modules/processors.
• If a failure occurs, it is cheaper and easier to find and replace the malfunctioning processor.
• If a fault occurs in one processor, the other processors can take over the task of the failed processor.

1.8 INSTRUCTIONS
• An instruction is a unit of a program that directs the computer processor to perform an operation.
• Every instruction is defined by the instruction set of the processor.

1.8.1 Instruction Set


• The list of all instructions, with all their variants, that a processor can execute is called its instruction set.
• It is the group of commands defined by the processor in machine-understandable language.

Example
• Arithmetic instructions → Add, Subtract, Multiply and Divide
• Logic instructions → And (conjunction), Or (disjunction), Not (negation)
• Control flow instructions → Goto, if ... goto, call, and return
• Data handling and memory instructions → Read, Write, Copy, Set, Load, Store

Instruction Sets are differentiated based on


 Operand storage in the CPU (data can be stored in a stack structure or in
registers)
 Number of explicit operands per instruction (zero, one, two, and three address)
 Operand location (instructions can be classified as register-to-register, register-
to-memory or memory-to-memory)
 Operations
 Type and size of operands (operands can be addresses, numbers, or even
characters)

Format

An instruction has three fields, namely

• Operation code (Opcode) specifies the type of operation to be performed.
• Mode field specifies the way the operand or effective address is determined.
• Address field specifies a memory address or a processor register.

Opcode | Mode field | Address field

1.8.2 Types of Instruction format


0 – Operand instruction
• These instructions have no address fields.
• They are also called zero-address instructions or stack instructions.
• They do not name source or destination addresses; the addresses are implicit.
• All operations are performed on a stack data structure.
• The operands are at the top of the stack.
• In other words, the absolute address of the operand is held in a special register that is automatically incremented (or decremented) to point to the location of the top of the stack.
• Syntax
Stack_Operation / Operation
• Example
To perform C = A + B, the instructions are,
PUSH A //Pushes the value of A onto the stack
PUSH B //Pushes the value of B onto the stack
ADD //Pops A and B and pushes their sum
POP C //Pops the sum from the stack and stores it in C
 Advantages
o It is a simple model of expression evaluation.
o The instructions are short.
 Disadvantages
o A stack can’t be randomly accessed.
o This makes it hard to generate effective code.
o Since the same stack is used for every operation, it creates a bottleneck.
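
The zero-address sequence for C = A + B can be traced with a small simulator. This is a minimal Python sketch, not a model of any real processor: the function name run_stack_program, the tuple encoding of instructions and the sample values of A and B are all assumptions.

```python
def run_stack_program(program, memory):
    """Execute (opcode, operand) pairs on a zero-address (stack) machine."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":      # copy a memory value onto the stack
            stack.append(memory[operand])
        elif opcode == "POP":     # move the top of the stack into memory
            memory[operand] = stack.pop()
        elif opcode == "ADD":     # operands are implicit: the top two entries
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


memory = {"A": 3, "B": 4, "C": 0}
program = [("PUSH", "A"), ("PUSH", "B"), ("ADD", None), ("POP", "C")]
run_stack_program(program, memory)
print(memory["C"])  # 7
```

ADD carries no operand at all, which is why these instructions are short, and every step goes through the same stack, which is the bottleneck noted above.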
1 – Operand instruction
 These instructions contain one address field.
• They are also known as one-address instructions or accumulator instructions.
• The accumulator (ACC) register is used for manipulation of data.
o All operations are carried out between the accumulator register and a memory operand.
• Syntax
Operation Operand_Location
• Example
ADD A is equivalent to ACC ← ACC + A
where A → memory operand; the accumulator is the implicit source and destination
The arithmetic operation C = A + B is performed as,
LOAD A
ADD B
STORE C
• Advantage
o The instructions are short.
• Disadvantage
o The accumulator is the only temporary storage, so memory traffic is the highest for this approach.
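
The accumulator sequence can be simulated in the same way, this time counting memory accesses to show where the traffic comes from. This is a Python sketch under stated assumptions: the function name, the tuple encoding of instructions and the sample values are illustrative only.

```python
def run_accumulator_program(program, memory):
    """Run (opcode, address) pairs; returns (memory, memory_accesses)."""
    acc = 0                  # the single accumulator register
    memory_accesses = 0
    for opcode, address in program:
        if opcode == "LOAD":
            acc = memory[address]
        elif opcode == "ADD":
            acc = acc + memory[address]
        elif opcode == "STORE":
            memory[address] = acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        memory_accesses += 1  # every instruction here touches memory once
    return memory, memory_accesses


program = [("LOAD", "A"), ("ADD", "B"), ("STORE", "C")]
memory, accesses = run_accumulator_program(program, {"A": 3, "B": 4, "C": 0})
print(memory["C"], accesses)  # 7 3
```

Each of the three instructions reads or writes memory, because the accumulator is the only place a value can be kept between operations.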
2 - Operand Instructions
• These instructions contain two address fields, namely the source and the destination.
• Each address field specifies either a processor register or a memory location.
• They are also called two-address instructions or general-purpose register instructions.
• Syntax
Operation Destination_Location, Source_Location
• Example
To perform C = A + B, an intermediate register R1 is used as,

LOAD R1, A
ADD R1, B
STORE C, R1

or, with both operands in registers,

LOAD R1, A
LOAD R2, B
ADD R1, R2
STORE C, R1

Advantage
o Makes code generation easy.
 Disadvantage
o All operands must be named leading to longer instructions.
3 - Operand instruction
 These instructions contain three address fields.
• They are also called three-address instructions or general-purpose register instructions.
• Each address field may specify a processor register or a memory operand.
• Syntax
Operation Source1_Location, Source2_Location, Destination_Location
• Example: To perform C = A + B, the code is

ADD A, B, C

or

MOVE R1, A
ADD C, R1, B

or

LOAD R1, A
LOAD R2, B
ADD R3, R1, R2
STORE C, R3

Note that the first form lists the destination last, as in the syntax above, while the register forms list the destination first.
 Advantage
o Makes code generation easy.
 Disadvantage
o All operands must be named leading to longer instructions.
1.8.3 Instruction Execution
The four phases in instruction execution are
 Fetch the Instruction from memory - the instruction is fetched from the memory
location whose address is in Program Counter (PC) and is placed in the instruction
register.
 Decode the Instruction.
 Execute the Instruction - the operands are fetched from the memory or processor
registers, and the operation is performed.
 Store the result in the destination location.
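
The four phases can be sketched as a loop over a small program. The Python code below is a simplified illustration, not a description of real hardware: the textual instruction format, the register names and the use of a list index as the program counter are assumptions.

```python
def run_program(program, memory):
    """Run LOAD/ADD/STORE instructions and return the updated memory."""
    registers = {}
    pc = 0                                   # program counter (a list index here)
    while pc < len(program):
        instruction = program[pc]            # 1. fetch the instruction at PC
        pc += 1
        # 2. decode: split the text into an opcode and its operands
        opcode, *operands = instruction.replace(",", " ").split()
        # 3. execute and 4. store the result in the destination
        if opcode == "LOAD":
            dest, address = operands
            registers[dest] = memory[address]
        elif opcode == "ADD":
            dest, src1, src2 = operands
            registers[dest] = registers[src1] + registers[src2]
        elif opcode == "STORE":
            address, src = operands
            memory[address] = registers[src]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


memory = run_program(
    ["LOAD R1, A", "LOAD R2, B", "ADD R3, R1, R2", "STORE C, R3"],
    {"A": 3, "B": 4, "C": 0},
)
print(memory["C"])  # 7
```

The program is the three-address load/store sequence for C = A + B, with the destination written first.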

1.8.4 MIPS Instruction Formats

  R - Format

o Opcode = 0
o Three register operands: rs, rt, and rd
 rs and rt - sources
 rd - destination
o shamt field - used only for shifts
o funct field - selects the ALU function (add, sub, and, or, slt); it is decoded by the ALU control unit

ALU control lines | Function
000 | AND
001 | OR
010 | Add
110 | Subtract
111 | Set on less than

 I - Format

o For load and store instructions


  • Opcode = 35 (for load) and Opcode = 43 (for store)
  • rs is the base register
  • rt is
    o For loads, the destination register for the loaded value
    o For stores, the source register whose value should be stored into memory
  • The memory address is computed as
    Memory address = base register + sign-extended 16-bit address field
o For branch instructions
  • Opcode = 4
  • rs and rt are the source registers that are compared for equality
  • The branch target address is computed as
    Target address = (PC + 4) + (sign-extended 16-bit offset << 2)

 J – Format

o Opcode = 2
o The destination address is computed as
Target address = (PC + 4)[31-28] : (26-bit address field << 2), where ":" denotes concatenation
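
The three formats differ only in how the 32 bits after the opcode are divided. The Python sketch below packs and unpacks the fields, assuming the standard MIPS32 field widths (6-bit opcode, 5-bit rs, rt, rd and shamt, 6-bit funct, 16-bit immediate, 26-bit address). The function names are hypothetical, and the register numbers ($t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19) and the funct value 32 for add follow the usual MIPS convention rather than anything stated in these notes.

```python
def encode_r(rs, rt, rd, shamt, funct):
    """R-format: opcode 0 | rs | rt | rd | shamt | funct."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(opcode, rs, rt, immediate):
    """I-format: opcode | rs | rt | 16-bit immediate."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j(opcode, address):
    """J-format: opcode | 26-bit address field."""
    return (opcode << 26) | (address & 0x3FFFFFF)


def decode_fields(word):
    """Extract every field; which ones are meaningful depends on the format."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "immediate": word & 0xFFFF,
        "address": word & 0x3FFFFFF,
    }


add_word = encode_r(rs=17, rt=18, rd=8, shamt=0, funct=32)   # add $t0, $s1, $s2
lw_word = encode_i(opcode=35, rs=19, rt=8, immediate=32)     # lw $t0, 32($s3)
j_word = encode_j(opcode=2, address=0x100)                   # j to word address 0x100

print(f"{add_word:#010x}")  # 0x02324020
print(f"{lw_word:#010x}")   # 0x8e680020
print(f"{j_word:#010x}")    # 0x08000100
```

Decoding lw_word with decode_fields returns opcode 35, rs 19, rt 8 and immediate 32, which are the values that went in.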

1.9 LOGICAL INSTRUCTIONS


• Instructions that perform logical operations and manipulate Boolean (bit) values are called logical instructions.
• They include AND, OR and NOR (from which NOT is obtained), their immediate forms, and the shift instructions.
1.9.1 AND instruction
• It contains three register operands.
• It performs a bitwise AND operation between the two source registers and stores the result in the destination register.
• It is also called the conjunction operation.
 Syntax
Operation destination, source1, source2
 Example
AND R3, R1, R2 //Equivalent to R3 = R1 & R2
1.9.2 OR instruction
• The OR instruction contains three register operands.
• It performs a bitwise OR operation between the two source registers and stores the result in the destination register.
• This is also called the disjunction operation.
 Syntax
Operation destination, source1, source2
 Example
OR R3, R1, R2 //Equivalent to R3 = R1 | R2
1.9.3 NOR instruction
• This instruction has three register operands.
• It performs a bitwise NOR operation (OR followed by NOT) between the two source registers and stores the result in the destination register.
 Syntax
Operation destination, source1, source2
 Example
NOR R1, R2, R3 // Equivalent to R1 = ~ (R2 | R3)
1.9.4 AND Immediate (ANDI) instruction
• This instruction contains two register operands and an immediate value.
• It performs a bitwise AND operation between a source register and the specified immediate value and stores the result in the destination register.
• Syntax
Operation destination, source1, Immediate_Value
• Example
ANDI R1, R2, Immediate_Value // Equivalent to R1 = R2 & Imm_Val
1.9.5 OR Immediate (ORI) instruction
• This instruction has two register operands and an immediate value.
• It performs a bitwise OR operation between a source register and the specified immediate value and stores the result in the destination register.
• Syntax
Operation destination, source1, Immediate_Value
• Example
ORI R1, R2, Immediate_Value //Equivalent to R1 = R2 | Imm_Val
1.9.6 Shift Left Logical instruction
• This instruction contains two register operands and a constant shift amount.
• It shifts the source register value left by the shift amount given in the instruction, filling the vacated low-order bits with zeros, and stores the result in the destination register.
• Syntax
Operation destination, source1, constant
• Example
SLL R1, R2, 10 // Equivalent to R1 = R2 << 10
1.9.7 Shift Right Logical instruction
• This instruction has two register operands and a constant shift amount.
• It shifts the source register value right by the shift amount given in the instruction, filling the vacated high-order bits with zeros, and stores the result in the destination register.
• Syntax
Operation destination, source1, Constant
• Example
SRL R1, R2, 10 // Equivalent to R1 = R2 >> 10 (zero fill)
1.9.8 Shift Right Arithmetic instruction
• This instruction has two register operands and a constant shift amount.
• It shifts the source register value right by the shift amount given in the instruction, filling the vacated high-order bits with copies of the sign bit, and places the result in the destination register.
• Syntax
Operation destination, source1, Constant
• Example
SRA R1, R2, 10 // Equivalent to R1 = R2 >> 10 (sign bit replicated)
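
The difference between the two right shifts only shows up when the sign bit is set. The Python sketch below models the logical instructions on 32-bit register values. The function names are hypothetical, and the zero-extension of the 16-bit immediate in ANDI and ORI follows the standard MIPS definition, which these notes do not state.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register


def nor(a, b):
    """Bitwise NOR: OR followed by NOT."""
    return ~(a | b) & MASK32


def andi(a, imm16):
    """AND with a zero-extended 16-bit immediate."""
    return a & (imm16 & 0xFFFF)


def ori(a, imm16):
    """OR with a zero-extended 16-bit immediate."""
    return (a | (imm16 & 0xFFFF)) & MASK32


def sll(a, shamt):
    """Shift left logical: vacated low bits become 0."""
    return (a << shamt) & MASK32


def srl(a, shamt):
    """Shift right logical: vacated high bits become 0."""
    return (a & MASK32) >> shamt


def sra(a, shamt):
    """Shift right arithmetic: vacated high bits copy the sign bit."""
    a &= MASK32
    if a & 0x80000000:       # negative in two's complement
        a -= 1 << 32
    return (a >> shamt) & MASK32


print(hex(nor(0x0F, 0xF0)))     # 0xffffff00
print(hex(sll(1, 10)))          # 0x400
print(hex(srl(0x80000000, 4)))  # 0x8000000
print(hex(sra(0x80000000, 4)))  # 0xf8000000
```

NOR of a register with zero gives the bitwise NOT of that register, which is how NOT is obtained without a dedicated instruction.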

1.10 CONTROL OPERATIONS


• Control instructions change the flow of execution, either based on a condition or unconditionally.
• Conditional control instructions perform a test by evaluating a logical condition.
• Depending on the outcome of the condition, the instruction modifies the PC to take the branch or continues to the next instruction.
• Control operations are of two types.
  • Conditional branch → Performs branching based on a condition
  • Unconditional branch → Performs branching without any condition

1.10.1 Conditional Branch

An instruction that directs the computer to another part of the program based on the
results of a comparison is called conditional branching.

1.10.1.1 BEQ Instruction

 BEQ stands for branch on equal.

 The instruction checks if the two register values are equal. If so, it branches to
the specified offset.
 Syntax
Operation source1, source2, offset
 Example
BEQ R1, R2, OFFSET // Equivalent to
if (R1==R2) goto L1;
a=b+c;
L1: a=b-c;
1.10.1.2 BNE Instruction

• BNE stands for branch on not equal.

• The instruction branches to the specified offset if the two register values are not equal.
 Syntax
Operation source1, source2, offset
 Example
BNE R1, R2, OFFSET // Equivalent to
if (R1 != R2) goto L1;
a=b+c;
L1: a=b-c;
1.10.2 Unconditional Branch
An instruction that directs the computer to another part of the program without any
test/condition operation is called unconditional branching.

1.10.2.1 J Instruction

• J stands for jump.
• It jumps unconditionally to the specified address.
• Syntax
Operation target
• Example
J Target_Address;
1.10.2.2 JAL Instruction

• JAL stands for jump and link.
• The instruction jumps to the specified address and stores the return address (the address of the next instruction) in the return address register, $31 in MIPS.
• Syntax
Operation target
• Example
JAL Target_Address;
1.10.2.3 JR Instruction
• JR stands for jump register.
• It jumps to the address contained in the specified register R.
• Syntax
Operation source
• Example
JR R1;
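
All five control instructions come down to choosing the next value of the PC. The Python sketch below makes that choice explicit under simplifying assumptions: instructions are 4 bytes long, branch offsets are counted in words from the following instruction, jump targets are given as full byte addresses, and branch delay slots are ignored. The function name next_pc, the tuple encoding and the register name RA are hypothetical.

```python
def next_pc(pc, instruction, registers):
    """Return the address of the instruction that executes after this one."""
    opcode, *args = instruction
    fall_through = pc + 4                      # address of the next instruction
    if opcode == "BEQ":
        rs, rt, offset = args
        taken = registers[rs] == registers[rt]
        return fall_through + (offset << 2) if taken else fall_through
    if opcode == "BNE":
        rs, rt, offset = args
        taken = registers[rs] != registers[rt]
        return fall_through + (offset << 2) if taken else fall_through
    if opcode == "J":
        return args[0]                         # unconditional jump
    if opcode == "JAL":
        registers["RA"] = fall_through         # link: remember the return address
        return args[0]
    if opcode == "JR":
        return registers[args[0]]              # jump to the address in a register
    raise ValueError(f"unknown opcode: {opcode}")


regs = {"R1": 5, "R2": 5, "R3": 7, "RA": 0}
print(next_pc(100, ("BEQ", "R1", "R2", 3), regs))  # 116
print(next_pc(100, ("BNE", "R1", "R2", 3), regs))  # 104
print(next_pc(100, ("JAL", 400), regs))            # 400
print(next_pc(400, ("JR", "RA"), regs))            # 104
```

The JAL and JR pair is how a procedure call and return work: JAL saves 104 in RA, and JR RA brings execution back to it.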

1.11 MIPS ADDRESSING AND ADDRESSING MODES


• There are various ways to specify the address of the operands for a given operation such as load, add or branch. The different ways of determining the address of the operands specified in an instruction are called addressing modes. In other words, an addressing mode specifies how the location of an operand, which may be in memory or in a register, is determined.
 MIPS addressing modes are as follows:
 Immediate addressing: The operand is a constant within the instruction itself
 Register addressing: The operand is a register
 Base or displacement addressing: The operand is at the memory location whose
address is the sum of a register and a constant in the instruction
 PC-relative addressing: The branch address is the sum of the PC and a constant
in the instruction
 Pseudo-direct addressing: The jump address is the 26 bits of the instruction
concatenated with the upper bits of the PC
1.11.1 Immediate Addressing

• The instruction contains an immediate operand that has a constant value or an expression.
Instruction
op | rs | rt | Immediate

• The operand is embedded inside the instruction.
• Since the instruction does not require an extra memory access to fetch the operand, it executes faster.

1.11.2 Register Addressing

• It is the simplest of all the addressing modes. The instruction works on register operands, and it is much faster than the other addressing modes because it does not involve memory access.

• The register address is specified as part of the instruction.

Effective Address, EA = Register Address, A

1.11.3 Base Addressing

• The address of the operand is the sum of the immediate and the value in a register (rs). The 16-bit immediate is a two's complement number.

• Effective Address = A + (R), where A is the immediate and (R) is the contents of the register

1.11.4 PC Relative Addressing

• For relative addressing, the implicitly referenced register is the program counter (PC).
• PC-relative addressing is used for conditional branches. The address is the sum of the program counter and a constant in the instruction.

1.11.5 Pseudo Direct Addressing


In Register Direct Addressing, the effective (memory) address is held in a register. It is also called "Indirect Addressing". It is a special case of base addressing where the offset is 0, and it is used with the jump register instruction.
Example: jr $31

Direct Addressing: the address is "the immediate". A full 32-bit address cannot be embedded in a 32-bit instruction, so this mode is not available.

Pseudodirect addressing: 26 bits of the address are embedded as the immediate and are used as the instruction offset within the current 256 MB (64 MWord) region defined by the most significant 4 bits of the PC.

Example: j Label
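
The five MIPS addressing modes can be compared by writing each one as a small function. This is a Python sketch under stated assumptions: the function names, register contents and addresses are made up, the immediate is sign-extended as for an add-immediate instruction, and the PC is taken to have already advanced to the next instruction (PC + 4), as in the standard MIPS definition.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret a 16-bit field as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def immediate_operand(imm16):
    """Immediate addressing: the operand is the constant in the instruction."""
    return sign_extend16(imm16)


def register_operand(registers, reg):
    """Register addressing: the operand is the contents of a register."""
    return registers[reg]


def base_address(registers, base_reg, imm16):
    """Base addressing: register contents plus the constant."""
    return (registers[base_reg] + sign_extend16(imm16)) & MASK32


def pc_relative_address(pc, imm16):
    """PC-relative addressing: next PC plus the word offset times 4."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & MASK32


def pseudo_direct_address(pc, address26):
    """Pseudo-direct addressing: upper 4 PC bits joined to the 26-bit field << 2."""
    return ((pc + 4) & 0xF0000000) | ((address26 & 0x3FFFFFF) << 2)


registers = {"$s3": 0x10008000}
print(immediate_operand(0xFFFF))                      # -1
print(hex(register_operand(registers, "$s3")))        # 0x10008000
print(hex(base_address(registers, "$s3", 32)))        # 0x10008020
print(hex(pc_relative_address(0x00400000, 3)))        # 0x400010
print(hex(pseudo_direct_address(0x00400000, 0x100)))  # 0x400
```

Only base addressing produces a data memory address here. The PC-relative and pseudo-direct results are instruction addresses, used by branches and jumps respectively.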
