Pipelining and Vector Processing
1. PARALLEL PROCESSING
• Parallel computing architectures break a job into discrete parts
that can be executed concurrently.
• Each part is further broken down into a series of instructions.
Instructions from each part execute simultaneously on different
CPUs.
• Parallel systems deal with the simultaneous use of multiple
computer resources that can include a single computer with multiple
processors, a number of computers connected by a network to form
a parallel processing cluster or a combination of both.
• Real-life example:
Serial computing: people stand in a queue waiting for movie tickets
and there is only one cashier, who issues tickets one at a time. The
wait gets longer when there are 2 queues and still only one cashier.
Parallel computing: the wait gets shorter when there are 2 queues and
2 cashiers issuing tickets to 2 people simultaneously.
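The ticket-counter example above can be put into numbers. The following is a minimal Python sketch, assuming every customer needs the same fixed service time and customers are shared as evenly as possible among the cashiers. The function name `total_service_time` and its parameters are hypothetical, introduced only for illustration.

```python
import math


def total_service_time(customers, cashiers, minutes_per_ticket=1.0):
    """Time until the last customer is served.

    Assumed model (not from the notes): equal service time per customer,
    customers shared evenly among the cashiers.
    """
    if customers < 0 or cashiers < 1:
        raise ValueError("need customers >= 0 and cashiers >= 1")
    # The busiest cashier serves ceil(customers / cashiers) people.
    return math.ceil(customers / cashiers) * minutes_per_ticket


serial = total_service_time(10, 1)    # one cashier: serial processing
parallel = total_service_time(10, 2)  # two cashiers: parallel processing
print(serial, parallel)               # 10.0 5.0
```

Under these assumptions the second cashier halves the waiting time. With an odd number of customers the gain is slightly smaller, because one cashier serves one extra person.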
Importance of Parallel Processing
• Parallel processing reduces execution time by performing multiple
operations simultaneously instead of sequentially.
• It improves overall system performance and speed.
• It ensures better utilization of system resources such as CPU cores
and memory.
• Parallel processing helps in handling large data sets and complex
problems efficiently.
• It increases system throughput by completing more tasks in a given
amount of time.
• It supports scalability, as performance can be improved by adding
more processors.
• It is essential for modern applications like artificial intelligence,
scientific simulations, and image processing.
Levels of Parallelism:
Levels of parallelism in computing refer to the different ways in which tasks
are broken down and executed simultaneously.
1. Bit-level parallelism: This form of parallel computing is based on
increasing the processor’s word size. A larger word size reduces the number of
instructions the system must execute to operate on data wider than one word.
2. Instruction-level parallelism: Instruction-level parallelism means
executing multiple instructions at the same time during one clock cycle.
• The processor reorders and groups instructions.
• Instructions are executed together only if they do not affect each other’s
result.
3. Task Parallelism: Task parallelism means dividing a big task into smaller
subtasks and executing them at the same time.
• Each subtask is assigned to a different processor or core.
• All subtasks run concurrently.
4. Data-level parallelism (DLP): Data-level parallelism means performing
the same instruction on multiple data items at the same time.
• One instruction works on many data values simultaneously.
• Common in array and vector operations.
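Two of the levels above can be illustrated with a short Python sketch. The first part shows bit-level parallelism: adding two 16-bit numbers takes two add instructions on an assumed 8-bit ALU but only one on a 16-bit ALU. The second part shows the rule behind instruction-level parallelism: two instructions may be issued together only if neither affects the other's result. All function names and the `(destination, sources)` instruction format are hypothetical, chosen for illustration only.

```python
def add16_with_8bit_alu(a, b):
    """Add two 16-bit values using only 8-bit additions (assumed 8-bit ALU).

    Returns (sum modulo 2**16, number of add instructions used).
    """
    low = (a & 0xFF) + (b & 0xFF)            # instruction 1: add the low bytes
    carry = low >> 8                         # carry out of the low byte
    high = ((a >> 8) & 0xFF) + ((b >> 8) & 0xFF) + carry  # instruction 2: high bytes plus carry
    return ((high & 0xFF) << 8) | (low & 0xFF), 2


def add16_with_16bit_alu(a, b):
    """The same addition on a 16-bit ALU needs a single add instruction."""
    return (a + b) & 0xFFFF, 1


def can_issue_together(first, second):
    """Instruction-level parallelism check on (destination, sources) pairs.

    True when neither instruction writes a register that the other
    reads or writes, so executing them together cannot change a result.
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    return dest1 != dest2 and dest1 not in srcs2 and dest2 not in srcs1


print(add16_with_8bit_alu(0x12FF, 0x0001))    # same sum, 2 add instructions
print(add16_with_16bit_alu(0x12FF, 0x0001))   # same sum, 1 add instruction
print(can_issue_together(("r1", ("r2", "r3")), ("r4", ("r5", "r6"))))  # independent
print(can_issue_together(("r1", ("r2", "r3")), ("r4", ("r1", "r5"))))  # second reads r1
```

Both adders return the same sum; only the instruction count differs, which is the saving that a wider word size provides.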
GENERAL CLASSIFICATIONS OF COMPUTER ARCHITECTURE
Three computer architectural classification schemes are available:
1. Flynn’s classification (1966) based on the multiplicity of instruction
streams and data streams in a computer system.
2. Feng’s classification (1972) based on the number of bits processed in unit
time (serial versus parallel processing).
3. Handler’s classification (1977) based on the degree of parallelism found
in CPU, ALU and bit levels.
FLYNN’S CLASSIFICATION
• Flynn’s classification is the most popular taxonomy of computer
architecture. It was proposed by Michael J. Flynn in 1966 and is based on
the number of instruction streams and data streams.
• It is based on the notion of a stream of information. Two types of
information flow into a processor: instructions and data.
• The Instruction Stream is defined as the sequence of instructions
executed by the processing unit.
• The Data Stream is defined as the sequence of data, including inputs and
partial or temporary results, called for by the instruction stream.
TYPES OF FLYNN’S TAXONOMY
According to Flynn’s classification, each of the instruction and data streams
can be single or multiple. Computer architectures can therefore be classified
into four categories:
• Single-instruction stream Single-data streams (SISD)
• Single-instruction stream Multiple-data streams (SIMD)
• Multiple-instruction stream Single-data streams (MISD)
• Multiple-instruction stream Multiple-data streams (MIMD)
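The four categories follow mechanically from the two stream counts. A minimal Python sketch of that mapping is given below; the function name `flynn_class` is hypothetical and not part of any library.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn category for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    instr = "S" if instruction_streams == 1 else "M"   # Single or Multiple instruction
    data = "S" if data_streams == 1 else "M"           # Single or Multiple data
    return f"{instr}I{data}D"


print(flynn_class(1, 1))   # SISD
print(flynn_class(1, 8))   # SIMD
print(flynn_class(4, 1))   # MISD
print(flynn_class(4, 4))   # MIMD
```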
Single-instruction stream Single-data streams (SISD)
• An SISD computing system is a uni-processor machine which is capable
of executing a single instruction, operating on a single data stream.
• Single instruction: only one instruction stream is being acted on by the
CPU during any one clock cycle.
• Single data: only one data stream is being used as input during any one
clock cycle.
• Conventional single-processor Von Neumann computers are classified as
SISD systems.
• It is a serial (non-parallel) computer.
• Instructions are executed sequentially, but may be overlapped in their
execution stages (Pipelining). Most SISD uni-processor systems are
pipelined.
• SISD computers may have more than one functional unit, all under the
supervision of the control unit.
• Examples: Most PCs, single-CPU workstations, minicomputers,
mainframes. CDC-6600, VAX 11, IBM 7001 are SISD computers.
Single-instruction stream Multiple-data streams (SIMD)
• An SIMD system is capable of executing the same instruction on all the
CPUs but operating on different data streams.
• Single instruction: All processing units execute the same instruction at
any given clock cycle.
• Multiple data: Each processing unit can operate on a different data
element.
• An SIMD computer has a single control unit, which issues one instruction
at a time, and multiple ALUs or processing units, which carry that
instruction out on multiple data sets simultaneously.
• SIMD machines are well suited to scientific computing, since it involves
many vector and matrix operations.
• Example: Array Processors and Vector Pipelines
• Array processors: ILLIAC-IV, MPP
• Vector Pipelines: IBM 9000, Cray X-MP, Y-MP & C90
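The SIMD idea, one control unit broadcasting a single instruction to many processing units, can be modelled in a few lines of Python. This is only a software sketch: the list comprehension visits the elements one after another, whereas SIMD hardware would process them in the same clock cycle. The name `simd_step` is hypothetical.

```python
import operator


def simd_step(instruction, operands_a, operands_b):
    """Model one SIMD step.

    The control unit issues a single `instruction`; each processing
    element applies it to its own pair of data elements.
    """
    if len(operands_a) != len(operands_b):
        raise ValueError("each processing element needs one operand from each list")
    return [instruction(a, b) for a, b in zip(operands_a, operands_b)]


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
print(simd_step(operator.add, a, b))   # [11, 22, 33, 44]
print(simd_step(operator.mul, a, b))   # [10, 40, 90, 160]
```

Each call corresponds to one instruction from the single instruction stream; the four list positions stand for four data streams.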
Multiple-instruction stream Single-data streams (MISD)
• In MISD, multiple instructions operate on a single data stream.
• An MISD computing system is capable of executing different instructions
on different PUs but all of them operating on the same dataset.
• Multiple instruction: Each processing unit may be executing a different
instruction stream.
• Single data: Every processing unit operates on the same data element.
• Machines built using the MISD model are of little practical use in most
applications. A few such machines have been built, but none of them is
available commercially.
Multiple-instruction stream Multiple-data streams (MIMD)
• An MIMD system is capable of executing multiple instructions on multiple
data sets.
• Multiple Instruction: Every processor may be executing a different
instruction stream.
• Multiple Data: Every processor may be working with a different data
stream.
• MIMD systems are parallel computers capable of processing several
programs simultaneously.
• Can be categorized as loosely coupled or tightly coupled depending on
sharing of data and control.
• Examples:
• Most current supercomputers, networked parallel compute “grids” and
multi-processor SMP computers, including some types of PCs
• IBM-370, Cray-2, Cray X-MP, UNIVAC-1100/80
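The MIMD model can be sketched in Python by giving each worker its own program and its own data. The sketch below uses threads from the standard `concurrent.futures` module purely as a stand-in for separate processors. In CPython, threads running pure-Python code are interleaved rather than truly simultaneous, so this shows the structure of MIMD execution, not its speed. The name `mimd_run` is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def mimd_run(jobs):
    """Run a list of (program, data) pairs, one worker per pair.

    Each pair models one processor executing its own instruction
    stream (`program`) on its own data stream (`data`).
    """
    if not jobs:
        return []
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(program, data) for program, data in jobs]
        # Collect results in submission order, whatever the finishing order was.
        return [future.result() for future in futures]


results = mimd_run([
    (sum, [1, 2, 3]),      # processor 0: adds its data
    (max, [7, 2, 9]),      # processor 1: finds a maximum
    (sorted, [3, 1, 2]),   # processor 2: sorts its data
])
print(results)             # [6, 9, [1, 2, 3]]
```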
PIPELINING
• What is pipelining?
A way of speeding up the execution of instructions.
• Key Idea: Break a large computation into smaller pieces and execute
them in an overlapped manner.
Pipelining – Definition
Pipelining is a technique of decomposing a sequential process into sub-
operations, with each sub-operation executed in a separate dedicated segment that
operates concurrently with all other segments.
• Each segment performs partial processing of the task.
• The output of one segment is transferred to the next segment.
• The final result is obtained after data passes through all segments.
Basic Pipelining Concept
• A long operation is divided into smaller stages.
• Each stage requires less time than the original operation.
• Pipeline registers are placed between stages to store intermediate results.
• Once the pipeline is full, one result is produced every clock cycle.
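The last point can be turned into a cycle count. Assuming every stage takes exactly one clock cycle and the pipeline never stalls (both are simplifying assumptions, not stated in the notes), a pipeline with k stages needs k cycles to produce its first result and one more cycle for each further result, so n tasks take k + n - 1 cycles instead of n * k. A minimal Python sketch with hypothetical function names:

```python
def pipelined_cycles(stages, tasks):
    """Cycles to finish `tasks` tasks in a `stages`-stage pipeline.

    Assumes one clock cycle per stage and no stalls.
    """
    if stages < 1 or tasks < 0:
        raise ValueError("need stages >= 1 and tasks >= 0")
    if tasks == 0:
        return 0
    return stages + tasks - 1   # fill the pipeline, then one result per cycle


def sequential_cycles(stages, tasks):
    """Cycles without pipelining: every task passes through all stages alone."""
    return stages * tasks


def speedup(stages, tasks):
    """Ratio of non-pipelined time to pipelined time."""
    return sequential_cycles(stages, tasks) / pipelined_cycles(stages, tasks)


print(pipelined_cycles(4, 100))      # 103
print(sequential_cycles(4, 100))     # 400
print(round(speedup(4, 100), 2))     # 3.88
```

As the number of tasks grows, the speedup approaches the number of stages, which is why a full pipeline is the favourable case.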
Pipelining Analogy: Industrial Assembly Line
• Pipelining is analogous to an industrial assembly line.
Example divisions:
• Manufacturing division
• Packing division
• Delivery division
Operation:
• While the first product is being packed, the second product is being
manufactured.
• While the first product is delivered, the second is packed and the third is
manufactured.
Product/Time     Time 1   Time 2   Time 3   Time 4   Time 5
Manufacturing    P1       P2       P3       P4       P5
Packing          -        P1       P2       P3       P4
Delivery         -        -        P1       P2       P3
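The assembly-line table above can be generated for any number of stages and products. The Python sketch below assumes each division takes exactly one time step per product; the function name `space_time_diagram` and the "P1, P2, ..." labels are illustrative choices.

```python
def space_time_diagram(stage_names, num_items):
    """Return one row per stage; row[t] is the product in that stage at
    time step t + 1, or "" when the stage is idle.

    Assumes every stage takes exactly one time step per product.
    """
    num_steps = len(stage_names) + num_items - 1
    rows = []
    for stage_index in range(len(stage_names)):
        row = []
        for step in range(num_steps):
            item = step - stage_index   # product that entered the line `stage_index` steps ago
            row.append(f"P{item + 1}" if 0 <= item < num_items else "")
        rows.append(row)
    return rows


stages = ["Manufacturing", "Packing", "Delivery"]
for name, row in zip(stages, space_time_diagram(stages, 5)):
    cells = " ".join(f"{cell or '-':<3}" for cell in row)
    print(f"{name:<14}{cells}")
```

With three divisions and five products the diagram has seven time steps; the first five columns match the table, and the last two show the line draining.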
Classification of Pipeline Processors
According to the level of processing, Handler (1977) proposed the
following classification:
• Arithmetic pipeline
• Instruction pipeline
• Processor pipeline