
Understanding Parallel Processing Techniques

Parallel processing is a technique that allows simultaneous data-processing tasks to enhance computational speed and throughput in computer systems. Pipelining, a related concept, decomposes sequential processes into sub-operations executed concurrently, improving efficiency in arithmetic and instruction execution. The document also discusses the differences between hardwired and micro-programmed control units, highlighting their performance and flexibility in executing instructions.

Uploaded by

Mohit sharma
Copyright
© All Rights Reserved

Parallel Processing

Parallel processing is a class of techniques that enables a system to carry out
data-processing tasks simultaneously in order to increase the computational speed of a
computer system.

A parallel processing system can carry out simultaneous data-processing to achieve
faster execution time. For instance, while an instruction is being processed in the ALU
component of the CPU, the next instruction can be read from memory.

The primary purpose of parallel processing is to enhance the computer processing
capability and increase its throughput, i.e. the amount of processing that can be
accomplished during a given interval of time.

A parallel processing system can be achieved by having a multiplicity of functional units
that perform identical or different operations simultaneously. The data can be
distributed among the multiple functional units.

The following diagram shows one possible way of separating the execution unit into
eight functional units operating in parallel.

The operation performed in each functional unit is indicated in each block of the
diagram:
o The adder and integer multiplier perform the arithmetic operations with integer
numbers.
o The floating-point operations are separated into three circuits operating in
parallel.
o The logic, shift, and increment operations can be performed concurrently on
different data. All units are independent of each other, so one number can be
shifted while another number is being incremented.
Pipelining in Computer Architecture
The term Pipelining refers to a technique of decomposing a sequential process into sub-
operations, with each sub-operation being executed in a dedicated segment that
operates concurrently with all other segments.

The most important characteristic of a pipeline technique is that several computations
can be in progress in distinct segments at the same time. The overlapping of
computation is made possible by associating a register with each segment in the
pipeline. The registers provide isolation between each segment so that each can operate
on distinct data simultaneously.

The structure of a pipeline organization can be represented simply by including an input
register for each segment followed by a combinational circuit.

Let us consider an example of combined multiplication and addition operation to get a
better understanding of the pipeline organization.

The combined multiplication and addition operation is done with a stream of numbers
such as:

Ai * Bi + Ci    for i = 1, 2, 3, ..., 7


The operation to be performed on the numbers is decomposed into sub-operations
with each sub-operation to be implemented in a segment within a pipeline.

The sub-operations performed in each segment of the pipeline are defined as:

R1 ← Ai, R2 ← Bi              Input Ai and Bi
R3 ← R1 * R2, R4 ← Ci         Multiply and input Ci
R5 ← R3 + R4                  Add Ci to product

The following block diagram represents the combined operation as well as the
sub-operations performed in each segment of the pipeline.

Registers R1 through R5 hold the data, and the combinational circuits (the multiplier and
the adder) each operate in a particular segment.

The output generated by the combinational circuit in a given segment is applied to the
input register of the next segment. For instance, from the block diagram, we can see
that the register R3 is used as one of the input registers for the combinational adder
circuit.
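
The register transfers listed above can be sketched in a few lines of Python. This is only an illustrative simulation, not part of the original notes: the function name run_pipeline and the use of None for an empty register are assumptions made for the sketch. All registers are updated together on each clock pulse, so the new values are computed from the old ones first.

```python
def run_pipeline(a, b, c):
    """Simulate the three-segment pipeline that computes a[i] * b[i] + c[i]."""
    n = len(a)
    r1 = r2 = r3 = r4 = r5 = None  # pipeline registers, empty at the start
    results = []
    # n inputs need n + 2 clock pulses to pass through three segments
    for clock in range(n + 2):
        # Compute every new register value from the OLD register contents,
        # because all registers load on the same clock pulse.
        new_r5 = r3 + r4 if r3 is not None else None        # segment 3: add
        new_r3 = r1 * r2 if r1 is not None else None        # segment 2: multiply
        new_r4 = c[clock - 1] if 1 <= clock <= n else None  # segment 2: input Ci
        new_r1 = a[clock] if clock < n else None            # segment 1: input Ai
        new_r2 = b[clock] if clock < n else None            # segment 1: input Bi
        r1, r2, r3, r4, r5 = new_r1, new_r2, new_r3, new_r4, new_r5
        if r5 is not None:
            results.append(r5)
    return results


print(run_pipeline([1, 2, 3], [4, 5, 6], [7, 8, 9]))  # [11, 18, 27]
```

The first result appears after three clock pulses, and one further result is produced on every pulse after that, which is the overlap the registers make possible.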

In general, the pipeline organization is applicable to two areas of computer design:

1. Arithmetic Pipeline
2. Instruction Pipeline

Both are introduced briefly below for a general idea and discussed in more detail in the
later sections.

1. Arithmetic Pipeline:
An arithmetic pipeline is a specialized processing pipeline designed to speed up the
execution of arithmetic operations. It is an integral part of the overall processor
design, dedicated to improving the performance of mathematical computations.

Components

o Addition Stage: In this stage, the pipeline performs the addition operation. It is a
fundamental arithmetic operation and is frequently broken down into sub-parts for
efficient processing.
o Multiplication Stage: For more complex arithmetic operations, such as multiplication,
a dedicated stage is included in the pipeline. Multiplication consists of a sequence of
partial products, and an arithmetic pipeline can streamline this process.
o Division Stage: Division is another arithmetic operation that can take advantage of
pipelining. Division involves more than one step, and breaking the process down into
pipeline stages can improve the overall speed of execution.
Advantages:

o Parallelism in Arithmetic Operations: Arithmetic pipelines take advantage of
parallelism by breaking down complex operations into small parts. This allows the
concurrent execution of multiple arithmetic operations, considerably enhancing
throughput.
o Optimized Resource Utilization: The pipeline structure allows for the best usage of
processing resources. While one arithmetic operation is in the multiplication stage,
another can be in the addition stage, maximizing the utilization of the processor.
o Enhanced Computational Speed: By dividing arithmetic operations into smaller,
manageable stages, the overall speed of computation is increased. This is particularly
important in applications in which mathematical calculations are a large component, such
as scientific computing or image processing.

2. Instruction Pipeline:
An Instruction Pipeline is a key component of a processor's structure designed to
facilitate the concurrent execution of multiple instructions. It breaks down the
execution of instructions into different stages, allowing the different stages to operate
simultaneously on different instructions.

Components:
o Instruction Fetch (IF): The first stage fetches the instruction from memory. The
program counter is used to determine the address of the next instruction.
o Instruction Decode (ID): In this stage, the fetched instruction is decoded to determine
the operation to be performed and to identify the operands involved.
o Execution (EX): The actual computation or operation specified by the instruction takes
place in this stage. It may involve arithmetic or logical operations.
o Memory Access (MEM): If the instruction requires access to memory, this is the stage in
which data is read from or written to memory.
o Write Back (WB): The final stage writes the results back to a register or to memory,
completing the execution of the instruction.
Advantages:

o Improved Throughput: The instruction pipeline allows for a continuous flow of
instructions through the processor, enhancing the overall throughput. While one
instruction is in the execution stage, another may be in the decoding stage,
resulting in better resource utilization.
o Faster Program Execution: By overlapping the execution of instructions, the time taken
to execute a series of instructions is reduced. The resulting faster program execution is
a vital element in enhancing the overall performance of a computer system.
o Effective Resource Management: Instruction pipelining allows effective management
of resources by permitting the different stages of the pipeline to operate concurrently.
This contributes to efficient and streamlined execution of instructions.
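
To put a rough number on the throughput advantage, the sketch below counts clock cycles with and without pipelining. It rests on simplifying assumptions that the text does not state: every stage takes exactly one clock cycle, and the pipeline never stalls (no branches, no memory conflicts). The function names are made up for this illustration.

```python
def cycles_without_pipeline(num_instructions, num_stages):
    """Each instruction passes through every stage before the next one starts."""
    return num_instructions * num_stages


def cycles_with_pipeline(num_instructions, num_stages):
    """The first instruction fills the pipeline; after that, one instruction
    completes on every cycle (idealized: no stalls)."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


# 100 instructions through the five stages IF, ID, EX, MEM, WB
print(cycles_without_pipeline(100, 5))  # 500
print(cycles_with_pipeline(100, 5))     # 104
```

Under these idealized assumptions the saving grows with the length of the instruction stream; a real pipeline loses some of it to stalls.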
Arithmetic Pipeline
Arithmetic Pipelines are mostly used in high-speed computers. They are used to
implement floating-point operations, multiplication of fixed-point numbers, and similar
computations encountered in scientific problems.

To understand the concepts of arithmetic pipeline in a more convenient way, let us
consider an example of a pipeline unit for floating-point addition and subtraction.

The inputs to the floating-point adder pipeline are two normalized floating-point binary
numbers defined as:

X = A * 2^a
Y = B * 2^b

where A and B are two fractions that represent the mantissas, and a and b are the
exponents. For simplicity, the numerical example that follows uses decimal numbers:

X = 0.9504 * 10^3
Y = 0.8200 * 10^2

The combined operation of floating-point addition and subtraction is divided into four
segments. Each segment contains the corresponding suboperation to be performed in
the given pipeline. The suboperations that are shown in the four segments are:

1. Compare the exponents by subtraction.
2. Align the mantissas.
3. Add or subtract the mantissas.
4. Normalize the result.

We will discuss each suboperation in a more detailed manner later in this section.

The following block diagram represents the suboperations performed in each segment
of the pipeline.
Note: Registers are placed after each suboperation to store the intermediate results.

1. Compare exponents by subtraction:


The exponents are compared by subtracting them to determine their difference. The
larger exponent is chosen as the exponent of the result.

The difference of the exponents, i.e., 3 - 2 = 1 determines how many times the mantissa
associated with the smaller exponent must be shifted to the right.

2. Align the mantissas:


The mantissa associated with the smaller exponent is shifted according to the difference
of exponents determined in segment one.

X = 0.9504 * 10^3
Y = 0.08200 * 10^3

3. Add mantissas:
The two mantissas are added in segment three.

Z = X + Y = 1.0324 * 10^3

4. Normalize the result:


After normalization, the result is written as:

Z = 0.10324 * 10^4
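
The four suboperations can be traced with a short Python sketch that uses the decimal values of the example. It is an illustration under stated assumptions rather than a description of real floating-point hardware: numbers are held as (mantissa, exponent) pairs in base 10, the function name fp_add is made up, and Python's decimal module is used so that the mantissa digits stay exact.

```python
from decimal import Decimal


def fp_add(x, y):
    """Add two base-10 floating-point numbers given as (mantissa, exponent)."""
    (a, exp_a), (b, exp_b) = x, y

    # Segment 1: compare the exponents by subtraction
    diff = exp_a - exp_b
    exponent = max(exp_a, exp_b)  # the larger exponent is the result exponent

    # Segment 2: align the mantissas by shifting the one with the smaller
    # exponent to the right by the exponent difference
    if diff > 0:
        b = b.scaleb(-diff)
    elif diff < 0:
        a = a.scaleb(diff)

    # Segment 3: add the mantissas
    z = a + b

    # Segment 4: normalize so that 0.1 <= |mantissa| < 1
    while abs(z) >= 1:
        z = z.scaleb(-1)
        exponent += 1
    while z != 0 and abs(z) < Decimal("0.1"):
        z = z.scaleb(1)
        exponent -= 1
    return z, exponent


mantissa, exponent = fp_add((Decimal("0.9504"), 3), (Decimal("0.8200"), 2))
print(mantissa, exponent)
```

For the example inputs, the mantissa sum 1.0324 overflows the fraction range, so it is shifted right once and the exponent is raised from 3 to 4, giving 0.10324 * 10^4.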

Instruction Pipeline
Pipeline processing can occur not only in the data stream but in the instruction stream
as well.

Most digital computers with complex instructions require an instruction pipeline to
carry out operations such as fetching, decoding, and executing instructions.

In general, the computer needs to process each instruction with the following sequence
of steps.

1. Fetch instruction from memory.
2. Decode the instruction.
3. Calculate the effective address.
4. Fetch the operands from memory.
5. Execute the instruction.
6. Store the result in the proper place.

Each step is executed in a particular segment, and there are times when different
segments may take different times to operate on the incoming information. Moreover,
there are times when two or more segments may require memory access at the same
time, causing one segment to wait until another is finished with the memory.

The organization of an instruction pipeline will be more efficient if the instruction cycle
is divided into segments of equal duration. One of the most common examples of this
type of organization is a Four-segment instruction pipeline.

A four-segment instruction pipeline combines two or more of these steps into a single
segment. For instance, the decoding of the instruction can be combined
with the calculation of the effective address into one segment.

The following block diagram shows a typical example of a four-segment instruction
pipeline. The instruction cycle is completed in four segments.
Segment 1:
The instruction fetch segment can be implemented using a first-in, first-out (FIFO) buffer.

Segment 2:
The instruction fetched from memory is decoded in the second segment, and the
effective address is then calculated in a separate arithmetic circuit.

Segment 3:
An operand from memory is fetched in the third segment.

Segment 4:
The instructions are finally executed in the last segment of the pipeline organization.
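
The overlap between the four segments can be visualized as a space-time table, where each row is an instruction and each column is a clock cycle. The sketch below builds such a table. The segment labels FI, DA, FO and EX (fetch instruction, decode and calculate address, fetch operand, execute) are abbreviations assumed for this illustration, and the sketch assumes an ideal pipeline with no branches and no memory-access conflicts.

```python
# Assumed abbreviations for the four segments described above
SEGMENTS = ["FI", "DA", "FO", "EX"]


def space_time(num_instructions, segments=SEGMENTS):
    """Return one row per instruction; column j holds the segment that the
    instruction occupies during clock cycle j (or "" if it is not in the pipe)."""
    k = len(segments)
    total_cycles = num_instructions + k - 1 if num_instructions else 0
    table = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(segments):
            row[i + s] = name  # instruction i enters segment s at cycle i + s
        table.append(row)
    return table


for i, row in enumerate(space_time(4), start=1):
    print(f"I{i}: " + " ".join(f"{cell:>2}" for cell in row))
```

Reading down any column shows which instructions are being processed at the same time, each in a different segment.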
Half - Adder
A half-adder circuit has two binary inputs and two binary outputs. The input variables
represent the augend and addend bits, whereas the output variables produce the sum and
carry. We can understand the function of a half-adder by formulating a truth table. The
truth table for a half-adder is:

o 'x' and 'y' are the two inputs, and S (Sum) and C (Carry) are the two outputs.
o The Carry output is '0' unless both the inputs are 1.
o 'S' represents the least significant bit of the sum.
The simplified sum of products (SOP) expressions are:

S = x'y+xy', C = xy
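
The two expressions above can be checked with a small Python sketch that evaluates them for every input combination. The function name half_adder is chosen for this illustration, and the complement x' is written as 1 - x for single-bit values.

```python
def half_adder(x, y):
    """Return (S, C) for single-bit inputs x and y."""
    s = ((1 - x) & y) | (x & (1 - y))  # S = x'y + xy'
    c = x & y                          # C = xy
    return s, c


print("x y | C S")
for x in (0, 1):
    for y in (0, 1):
        s, c = half_adder(x, y)
        print(f"{x} {y} | {c} {s}")
```

Read together as the two-bit number CS, the outputs equal the arithmetic sum x + y for every row.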

The logic diagram for a half-adder circuit can be represented as:


Full - Adder
This circuit needs three binary inputs and two binary outputs. The truth table for a full-
adder is:

o Two of the input variables, 'x' and 'y', represent the two significant bits to be
added.
o The third input variable 'z', represents the carry from the previous lower
significant position.
o The outputs are designated by the symbol 'S' for sum and 'C' for carry.
o The eight rows under the input variables designate all possible combinations of
0's, and 1's that these variables may have.
o The input-output logical relationship of the full-adder circuit may be expressed in
two Boolean functions, one for each output variable.
o Each output Boolean function can be simplified by using a unique map method.
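
As a sketch of the two Boolean functions mentioned above, the Python code below uses the standard sum-of-products forms for a full adder, S = x'y'z + x'yz' + xy'z' + xyz and C = xy + xz + yz. These expressions are the usual textbook results and are not written out in these notes, so treat them as an assumption of the example; the function name full_adder is also made up here.

```python
def full_adder(x, y, z):
    """Return (S, C) for single-bit inputs x, y and carry-in z."""
    nx, ny, nz = 1 - x, 1 - y, 1 - z  # complements x', y', z'
    s = (nx & ny & z) | (nx & y & nz) | (x & ny & nz) | (x & y & z)
    c = (x & y) | (x & z) | (y & z)
    return s, c


print("x y z | C S")
for x in (0, 1):
    for y in (0, 1):
        for z in (0, 1):
            s, c = full_adder(x, y, z)
            print(f"{x} {y} {z} | {c} {s}")
```

The eight printed rows correspond to the eight input combinations of the truth table, and in each row the two-bit number CS equals x + y + z.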
Maps for a full-adder:
The logic diagram for a full-adder circuit can be represented as:
Difference between Hardwired and Micro-programmed Control Unit

Prerequisite: Hardwired v/s Micro-programmed Control Unit

To execute an instruction, a CPU uses one of two types of control unit: a hardwired
control unit or a micro-programmed control unit.
1. Hardwired control units are generally faster than microprogrammed designs.
In hardwired control, we saw how all the control signals required inside the
CPU can be generated using a state counter and a PLA circuit.
2. A microprogrammed control unit is a relatively simple logic circuit that is
capable of (1) sequencing through microinstructions and (2) generating
control signals to execute each microinstruction.
The control unit’s implementation, whether hardwired or micro-programmed,
affects the performance and flexibility of the CPU.
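
The idea of sequencing through microinstructions can be sketched in Python as follows. Everything concrete in this sketch is hypothetical: the contents of the control memory, the control signal names, and the "next address" field are invented for illustration and do not describe any particular processor.

```python
# Hypothetical control memory: each address holds one microinstruction,
# stored as (control signals to generate, address of the next microinstruction).
CONTROL_MEMORY = {
    0: (["PC_to_MAR", "read_memory"], 1),
    1: (["MDR_to_IR", "increment_PC"], 2),
    2: (["decode_IR"], None),  # None marks the end of this micro-routine
}


def run_microprogram(control_memory, start=0):
    """Step through the microinstructions and collect the control signals
    generated at each step."""
    issued = []
    address = start
    while address is not None:
        signals, next_address = control_memory[address]
        issued.append(signals)  # (2) generate the control signals
        address = next_address  # (1) sequence to the next microinstruction
    return issued


for step, signals in enumerate(run_microprogram(CONTROL_MEMORY)):
    print(step, signals)
```

Changing the behaviour of such a unit only requires editing the entries in the control memory, whereas a hardwired unit would need its logic circuits redesigned.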
| Hardwired Control Unit | Microprogrammed Control Unit |
| --- | --- |
| Hardwired control unit generates the control signals needed for the processor using logic circuits | Microprogrammed control unit generates the control signals with the help of micro instructions stored in control memory |
| Hardwired control unit is faster when compared to microprogrammed control unit as the required control signals are generated with the help of hardware | This is slower than the other as micro instructions are used for generating signals here |
| Difficult to modify as the control signals that need to be generated are hard wired | Easy to modify as the modification needs to be done only at the instruction level |
| More costly as everything has to be realized in terms of logic gates | Less costly than hardwired control as only micro instructions are used for generating control signals |
| It cannot handle complex instructions as the circuit design for it becomes complex | It can handle complex instructions |
| Only a limited number of instructions are used due to the hardware implementation | Control signals for many instructions can be generated |
| Used in Reduced Instruction Set Computers (RISC) | Used in Complex Instruction Set Computers (CISC) |
