0% found this document useful (0 votes)

12 views13 pages

Multi-Processor Systems and Parallel Processing

The document discusses multi-processor and parallel processing in computer organization, highlighting the evolution from sequential to parallel processing and categorizing systems based on Flynn's taxonomy. It describes various architectures such as SIMD, MISD, and MIMD, with a focus on symmetric multiprocessors (SMP) and non-uniform memory access (NUMA) systems, along with their design complexities and advantages. Additionally, it covers interconnection networks like single bus, crossbar, multistage, and hypercube networks, emphasizing their roles in facilitating communication between processors.

Uploaded by

palm131719

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

12 views13 pages

Multi-Processor Systems and Parallel Processing

Uploaded by

palm131719

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

COMPUTER ORGANIZATION

Multi-Processor / Parallel Processing

Parallel Processing :

Originally, the computer has been viewed as a sequential machine. Most computer programming
languages require the programmer to specify algorithms as sequence of instruction.

Processor executes programs by executing machine instructions in a sequence and one at a time.

Each instruction is executed in a sequence of operations (fetch instruction, fetch operands, perform
operation store result.)

It is observed that, at the micro operation level, multiple control signals are generated at the same time.

Instruction pipelining, at least to the extent of overlapping fetch and execute operations, has been around
for long time.

By looking into these phenomenons, researcher has look into the matter whether some operations can be
performed in parallel or not.

As computer technology has evolved, and as the cost of computer hardware has dropped, computer
designers have sought more and more opportunities for parallelism, usual to enhance performance and, in
some cases, to increase availability.

The taxonomy first introduced by Flynn is still the most common way of categorizing systems with
parallel processing capability. Flynn proposed the following categories of computer system:

 Single Instruction, Multiple Data (SIMD) system: A single machine instruction controls the
simultaneous execution of a number of processing elements on a lockstep basis. Each processing
element has an associated data memory, so that each instruction is executed on a different set of
data by the different processors. Vector and array processors fall into this category
 Multiple Instruction, Single Data (MISD) system A sequence of data is transmitted to a set of
processors, each of which executes a different instruction sequence. This structure has never been
implemented.
 Multiple Instruction, Multiple Data (MIMD) system: A set of processors simultaneously execute
different instruction sequences on different data sets. SMPs, clusters, and NUMA systems fits
into this category.

With the MIMD organization, the processors are general purpose; each is able to process all of the
instructions necessary to perform the appropriate data transformation.

Further MIMD can be subdivided into two main categories:

 Symmetric multiprocessor (SMP): In an SMP, multiple processors share a single memory or a

pool of memory by means of a shared bus or other interconnection mechanism. A distinguish
feature is that the memory access time to any region of memory is approximately the same for
each processor.
 Nonuniform memory access (NUMA): The memory access time to different regions of memory
may differ for a NUMA processor.

[Link] Kumar, CSE dept, AITAM 1

COMPUTER ORGANIZATION

The design issues relating to SMPs and NUMA are complex, involving issues relating to physical
organization, interconnection structures, inter processor communication, operating system design, and
application software techniques.

Symmetric Multiprocessors:

A symmetric multiprocessor (SMP) can be defined as a standalone computer system with the following
characteristic:

1. There are two or more similar processor of comparable capability.

2. These processors share the same main memory and I/O facilities and are interconnected by a bus
or other internal connection scheme.
3. All processors share access to I/O devices, either through the same channels or through different
channels that provide paths to the same device.
4. All processors can perform the same functions.
5. The system is controlled by an integrated operating system that provides interaction between
processors and their programs at the job, task, file and data element levels.

The operating system of a SMP schedules processors or thread across all of the processors. SMP has a
potential advantages over uniprocessor architecture:

 Performance: A system with multiple processors will perform in a better way than one with a
single processor of the same type if the task can be organized in such a manner that some portion
of the work done can be done in parallel.
 Availability: Since all the processors can perform the same function in a symmetric
multiprocessor, the failure of a single processor does not stop the machine. Instead, the system
can continue to function at reduce performance level.
 Incremental growth: A user can enhance the performance of a system by adding an additional
processor.
 Sealing: Vendors can offer a range of product with different price and performance characteristics
based on number of processors configured in the system.

Organization: The organization of a multiprocessor system is shown in Figure 10.1

Figure 10.1: Block diagram of tightly coupled multiprocessors

[Link] Kumar, CSE dept, AITAM 2

COMPUTER ORGANIZATION

 There are two or more processors. Each processor is self sufficient, including a control unit,
ALU, registers and cache.
 Each processor has access to a shared main memory and the I/O devices through an
interconnection network.
 The processor can communicate with each other through memory (messages and status
information left in common data areas).
 It may also be possible for processors to exchange signal directly.
 The memory is often organized so that multiple simultaneous accesses to separate blocks of
memory are possible.
 In some configurations each processor may also have its own private main memory and I/O
channels in addition to the shared resources.

The organization of multiprocessor system can be classified as follows:

 Time shared or common bus

 Multiport memory
 Central control unit.

Time shared Bus:

Time shared bus is the simplest mechanism for constructing a multiprocessor system. The bus consists of
control, address and data lines. The block diagram is shown in Figure 10.2.

Figure 10.2: Time shared bus

The following features are provided in time-shared bus organization:

 Addressing: It must be possible to distinguish modules on the bus to determine the source and
destination of data
 Arbitration: Any I/O module can temporarily function as “master”. A mechanism is provided to
arbitrate competing request for bus control, using some sort of priority scheme.
 Time shearing: when one module is controlling the bus, other modules are locked out and if
necessary suspend operation until bus access in achieved.

[Link] Kumar, CSE dept, AITAM 3

COMPUTER ORGANIZATION

The bus organization has several advantages compared with other approaches:

 Simplicity: This is the simplest approach to multiprocessor organization. The physical interface
and the addressing, arbitration and time sharing logic of each processor remain the same as in a
single processor system.
 Flexibility: It is generally easy to expand the system by attaching more processor to the bus.
 Reliability: The bus is essentially a passive medium and the failure of any attached device should
not cause failure of the whole system.

The main drawback to the bus organization is performance. Thus, the speed of the system is limited by
the bus cycle time.

To improve performance, each processor can be equipped with local cache memory.

The use of cache leads to a new problem which is known as cache coherence problem. Each local cache
contains an image of a portion of main memory. If a word is altered in one cache, it may invalidate a
word in another cache. To prevent this, the other processors must perform an update in its local cache.

Multiport Memory:

The multiport memory approach allows the direct, independent access of main memory modules by each
processor and io module.
The multiport memory system is shown in Figure 10.3

Figure 10.3: Multiport memory

The multiport memory approach is more complex than the bus approach, requiring a fair amount of logic
to be added to the memory system. Logic associated with memory is required for resolving conflict. The
method often used to resolve conflicts is to assign permanently designated priorities to each memory port.

[Link] Kumar, CSE dept, AITAM 4

COMPUTER ORGANIZATION

Non-uniform Memory Access(NUMA)

In NUMA architecture, all processors have access to all parts of main memory using loads and stores. The
memory access time of a processor differs depending on which region of main memory is accessed. The
last statement is true for all processors; however, for different processors, which memory regions are
slower and which are faster differ.

A NUMA system in which cache coherence is maintained among the cache of the various processors is
known as cache-cohence NUMA (CC-NUMA)

A typical CC-NUMA organization is shown in the Figure 10.4.

Figure 10.4: CC- NUMA Organization

There are multiple independent nodes, each of which is, in effect, an SMP organization.

Each node contains multiple processors, each with its own L1 and L2 caches, plus main memory.

The node is the basic building block of the overall CC NUMA organization

The nodes are interconnected by means of some communication facility, which could be a switching
mechanism, a ring, or some other networking facility.

Each node in the CC-NUMA system includes some main memory.

From the point of view of the processors, there is only a single addressable memory, with each location
having a unique system-wide address.

When a processor initiates a memory access, if the requested memory location is not in the processors
cache, then the L2 cache initiates a fetch operation.

[Link] Kumar, CSE dept, AITAM 5

COMPUTER ORGANIZATION

If the desired line is in the local portion of the main memory, the line is fetch across the local bus.

If the desired line is in a remote portion of the main memory, then an automatic request is send out to
fetch that line across the interconnection network, deliver it to the local bus, and then deliver it to the
requesting cache on that bus.

All of this activity is atomic and transparent to the processors and its cache.

In this configuration, cache coherence is a central concern. For that each node must maintain some sort of
directory that gives it an indication of the location of various portion of memory and also cache status
information.

Interconnection Networks:

In a multiprocessor system, the interconnection network must allow information transfer between any pair
of modules in the system. The traffic in the network consists of requests (such as read and write), data
transfers, and various commands.

Single Bus:

The simplest and most economical means of interconnecting a number of modules is to use a single bus.

Since several modules are connected to the bus and any module can request a data transfer at any time, it
is essential to have an efficient bus arbitration scheme.

In a simple mode of operation, the bus is dedicated to a particular source-destination pair for the full
duration of the requested transfer. For example, when a processor uses a read request on the bus, it holds
the bus until it receives the desired data from the memory module.

Since the memory module needs a certain amount of time to access the data bus, the bus will be idle until
the memory is ready to respond with the data.

Then the data is transferred to the processors. When this transfer is completed, the bus can be assigned to
handle another request.

A scheme known as the split- transaction protocol makes it possible to use the bus during the idle period
to serve another request.

Consider the following method of handling a series of read requests possibly from different processor.

After transferring the address involved in the first request, the bus may be reassigned to transfer the
address of the second request; assuming that this request is to a different memory module.

At this point, we have two modules proceeding with read access cycle in parallel.

If neither module has finished with its access, the bus may be reassigned to a third request and so on.

Eventually, the first memory module completes its access cycle and uses the bus to transfer the data to the
corresponding source.

[Link] Kumar, CSE dept, AITAM 6

COMPUTER ORGANIZATION

As other modules complete their cycles, the bus is needed to transfer their data to the corresponding
sources.

The split transaction protocol allows the bus and the available bandwidth to be used more efficiently. The
performance improvement achieved with this protocol depends on the relationship between the bus
transfer time and the memory access time.

In split- transaction protocol, performance is improved at the cost of increased bus complexity.

There are two reasons why complexity increases:

 Since a memory module needs to know which source initiated a given read
request,
a source identification tag must be attached to the request.
 Complexity also increases because all modules, not just the processor, must be
able to act as bus muster.

The main limitation of a single bus is that the number of modules that can be connected to the bus is not
that large. Networks that allow multiple independent transfer operations to proceed in parallel can provide
significantly increased data transfer rate.

Crossbar Network:

Crossbar switch is a versatile switching network. It is basically a network of switches. Any module can
be connected to any other module by closing the appropriate switch. Such networks, where there is a
direct link between all pairs of nodes are called fully connected networks.

In a fully connected network, many simultaneous transfers are possible. If n sources need to send data to
n distinct destinations then all of these transfers can take place concurrently. Since no transfer is
prevented by the lack of a communication path, the crossbar is called a nonblocking switch.

In the Figure 10.5 of crossbar interconnection network, a single switch is shown at each cross point. In
actual multiprocessor system, the paths through the crossbar network are much wider.

Figure 10.5: Crossbar Interconnection Network

[Link] Kumar, CSE dept, AITAM 7

COMPUTER ORGANIZATION

If there are modules in a network, than the number of cross point is in a network to interconnect
modules. The total number of switches becomes large as increases.

In a crossbar switch, conflicts occur when two or more concurrent requests are made to the same
destination device. These conflicting requests are usually handled on a predetermined priority basis.

The crossbar switch has the potential for the highest bandwidth and system efficiency. However, because
of its complexity and cost, it may be cost effective for a large multiprocessor system.

Multistage Network:

The bus and crossbar systems use a single stage of switching to provide a path from a source to a
destination.

In multistage network, multiple stages of switches are used to setup a path between source and
destination.

Such networks are less costly than the crossbar structure, yet they provide a reasonably large number of
parallel paths between source and destinations.

In the Figure 10.6, it shows a three-stage network that called a shuffle network that interconnects eight
modules.

The term "shuffle" describes the pattern of connections from the outputs of one stage to the inputs of the
next stage.

The switchbox in the Figure 10.6 is a switch that can route either input to either output.

If the inputs request distinct outputs, they can both be routed simultaneously in the straight through or
crossed pattern.

If both inputs request the same output, only one request can be satisfied. The other one is blocked until
the first request finishes using the switch.

[Link] Kumar, CSE dept, AITAM 8

COMPUTER ORGANIZATION

Figure 10.6: Multistage Shuffle Network

A network consisting of stages can be used to interconnect modules. In this case, there is exactly one
path through the network from any module to any module . Therefore, this network provides full
connectivity between sources and destinations.

Many request patterns cannot be satisfied simultaneously. For example, the connection from P2 to P7 can
not be provided at the same time as the connection from P3 to P6.

A multistage network is less expansive to implement than a crossbar network. If nodes are to be
interconnected using this scheme, then we must use stages with switches per stage. Since
each switches contains four switches, the total number of switches is

which, for a large network, is considerably less than the switches needed in a crossbar network.

Multistage networks are less capable of providing concurrent connection than crossbar switches. The
connection path between and is indicated by RED lines in the Figure 10.6.

Hypercube Networks:

A hypercube is an -dimensional cube that interconnects nodes. In addition to the communication

circuit, each node usually includes a processor and a memory module as well as some I/O capability.

The Figure 10.7 shows a three dimensional hypercube. The small circles represent the communication
circuits in the nodes. The edge of the cube represent bi-directional communication links between
neighboring nodes.

[Link] Kumar, CSE dept, AITAM 9

COMPUTER ORGANIZATION

In an an -dimensional hypercube each node is directly connected to neighbors. A useful way to label
the nodes is to assign binary addresses to them in such a way that the addresses of any two neighbors
differs in exactly one bit position. The functional units are attached to each node of the hypercube.

Figure 10.7: A 3-dimensional Hypercube Network

Routing messages through the hypercube is easy. If the processor at node wishes to send a message to
node , it proceeds as follows:

 The binary addresses of the source, , and the destination, , are compared from least to most
significant bits.
 Suppose that they differ first in position

 Node then sends the message to its neighbor whose address, , differs from in bit position

 Node forwards the message to the appropriate neighbor using the same address comparison
scheme
 The message gets closer to destination node with each of these hops from one node to
another.
 For example, a message from node to transverse the following way:

Message traverses from node to N1, they differ in 1st bit position.

Then message traverses from to , they differ in 3 rd bit position.

Therefore, it takes two hops. The maximum distance that any message needs to travel in an -
dimensional hypercube is - hops.

[Link] Kumar, CSE dept, AITAM 10

COMPUTER ORGANIZATION

Mesh networks:

Mesh network is another way to interconnect a large number of nodes in a multiprocessor system. An
example of a mesh with 16 nodes is given in the Figure 10.8.

Figure 10.8: A 2-D Mesh Network

The link between the nodes are bi-directional. The functional unit are attached to the each node of the
mesh network. Routing in a mesh network can be done in several ways.

One of the simplest and most effective possibilities is to choose the path between a source node and a
destination node such that the transfer first takes place in the horizontal directional from towards
.

When the column in which resides is reached, the transfer proceeds in the vertical direction along this
column. If a wraparound connection is made between the nodes at the opposite edges of a mesh network,
the result is a network that comprises a set of bi-directional rings in the direction connected by a set of
rings in the direction.

This network is called a torus. The average latency of information transfer is reduced in a torus, but the
complexity increases.

Tree Networks:

A hierarchically structured network implemented in the form of a tree is another interconnection

topology. A four way tree that interconnects 16 modules is shown in the Figure 10.9.

In this tree, each parent node allows communication between two of its children at a time.

An intermediate-level node, for example node in the figure, can provide a connection from one of its
child node to its parent.

[Link] Kumar, CSE dept, AITAM 11

COMPUTER ORGANIZATION

This enables two leaf nodes that are any distance apart to communicate.

Only one path at any time can be established through a given node in the tree.

Figure 10.9: A four way Tree Network

To reduce the possibility of a bottleneck, the number of links in the upper levels of a tree hierarchy can be
increased. This is done in a fat tree network, in which each node in the tree (except at the top level) has
more than one parent.

The Figure 10.10 shows a fat tree in which each node has two parents.

Figure 10.10: A Fat Tree

One of the simplest network topologies uses a ring to interconnect the nodes in the system. A single ring
is shown in the Figure 10.11.

[Link] Kumar, CSE dept, AITAM 12

COMPUTER ORGANIZATION

Figure 10.11: A Single Ring

The main advantage of the arrangement is that the ring is easy to implement. Links in the ring can be
wide, because each node is connected to only two neighbors. It is not useful to construct a very long ring
to connect many nodes because the latency of information transfer would be unacceptably large

The simple possibility of using ring in a tree structure; this results in a hierarchy of rings. A hierarchy of
rings is shown in the Figure 10.12.

Having short rings reduces substantially the latency of transfers that involve nodes on the same ring.

The latency of transfers between two nodes on different rings is shorter than if a single ring were used.

Figure 10.12: Hierarchy of Rings

[Link] Kumar, CSE dept, AITAM 13

Understanding Multi-Processor Systems
No ratings yet
Understanding Multi-Processor Systems
12 pages
SIMD and Parallel Processing Overview
No ratings yet
SIMD and Parallel Processing Overview
12 pages
Understanding Multi-Processor Systems
No ratings yet
Understanding Multi-Processor Systems
12 pages
Overview of Parallel Processor Types
No ratings yet
Overview of Parallel Processor Types
24 pages
Characteristics of Symmetric Multiprocessing
No ratings yet
Characteristics of Symmetric Multiprocessing
11 pages
Parallel Processing in Multicore Systems
No ratings yet
Parallel Processing in Multicore Systems
64 pages
Multiprocessor Systems Overview
No ratings yet
Multiprocessor Systems Overview
21 pages
Parallel Processor Organizations Overview
No ratings yet
Parallel Processor Organizations Overview
32 pages
Parallel Processing Architectures Explained
No ratings yet
Parallel Processing Architectures Explained
28 pages
Overview of Parallel Processing Techniques
No ratings yet
Overview of Parallel Processing Techniques
48 pages
Module III
No ratings yet
Module III
20 pages
CH17-COA10e - Parallel Processing
No ratings yet
CH17-COA10e - Parallel Processing
45 pages
Understanding Parallel Processing Systems
No ratings yet
Understanding Parallel Processing Systems
30 pages
Understanding Parallel Processor Architectures
No ratings yet
Understanding Parallel Processor Architectures
117 pages
Understanding Multiprocessor Architecture
No ratings yet
Understanding Multiprocessor Architecture
22 pages
Types of Parallel Processor Systems
No ratings yet
Types of Parallel Processor Systems
52 pages
Parallel Processing Architectures Overview
No ratings yet
Parallel Processing Architectures Overview
40 pages
Overview of Multiprocessor Systems
No ratings yet
Overview of Multiprocessor Systems
50 pages
Understanding Multiprocessor Operating Systems
No ratings yet
Understanding Multiprocessor Operating Systems
35 pages
Parallel Processor Architectures Overview
No ratings yet
Parallel Processor Architectures Overview
40 pages
Characteristics of Multiprocessor Systems
No ratings yet
Characteristics of Multiprocessor Systems
7 pages
Understanding Parallel Organization Systems
No ratings yet
Understanding Parallel Organization Systems
32 pages
Parallel Processing in Computer Architecture
No ratings yet
Parallel Processing in Computer Architecture
29 pages
Overview of Parallel Processing Units
No ratings yet
Overview of Parallel Processing Units
47 pages
Shared vs. Distributed Memory Systems
No ratings yet
Shared vs. Distributed Memory Systems
28 pages
Overview of Multicore Processors
No ratings yet
Overview of Multicore Processors
51 pages
SMP Cache and Memory Access Overview
No ratings yet
SMP Cache and Memory Access Overview
56 pages
Multiprocessor Systems Overview
No ratings yet
Multiprocessor Systems Overview
6 pages
Parallel Processing
No ratings yet
Parallel Processing
16 pages
William Stallings Computer Organization and Architecture 10 Edition
No ratings yet
William Stallings Computer Organization and Architecture 10 Edition
34 pages
William Stallings Computer Organization and Architecture 10 Edition
No ratings yet
William Stallings Computer Organization and Architecture 10 Edition
34 pages
Parallel Processing in Computer Architecture
No ratings yet
Parallel Processing in Computer Architecture
51 pages
Parallel Processing in Computer Architecture
No ratings yet
Parallel Processing in Computer Architecture
51 pages
Characteristics and Structure of Multiprocessors
No ratings yet
Characteristics and Structure of Multiprocessors
52 pages
Uniprocessor vs Multiprocessor Systems
No ratings yet
Uniprocessor vs Multiprocessor Systems
15 pages
Section D
No ratings yet
Section D
8 pages
Asymmetric vs Symmetric Multiprocessing
No ratings yet
Asymmetric vs Symmetric Multiprocessing
61 pages
Understanding Multiprocessor Systems
No ratings yet
Understanding Multiprocessor Systems
96 pages
Lec 3
No ratings yet
Lec 3
5 pages
Interprocessor Communication & Synchronization
No ratings yet
Interprocessor Communication & Synchronization
23 pages
Unit6 - Microprocessor - Final 1
No ratings yet
Unit6 - Microprocessor - Final 1
30 pages
Parallel Processing in SMP Systems
No ratings yet
Parallel Processing in SMP Systems
32 pages
Differences Between Multiprocessors and Multicomputers
No ratings yet
Differences Between Multiprocessors and Multicomputers
49 pages
Multiprocessor vs. Multicomputer Systems
No ratings yet
Multiprocessor vs. Multicomputer Systems
52 pages
Overview of Multiprocessor Systems
No ratings yet
Overview of Multiprocessor Systems
36 pages
Memory Organization
No ratings yet
Memory Organization
17 pages
Characteristics of Multiprocessor Systems
No ratings yet
Characteristics of Multiprocessor Systems
14 pages
Multiprocessor System Overview and Features
No ratings yet
Multiprocessor System Overview and Features
107 pages
Parallel Processing Techniques Explained
No ratings yet
Parallel Processing Techniques Explained
50 pages
Flynn's Classification of Multiprocessor Systems
No ratings yet
Flynn's Classification of Multiprocessor Systems
87 pages
Computer Architecture and Pipelining Insights
No ratings yet
Computer Architecture and Pipelining Insights
15 pages
Understanding Parallel Processing Techniques
No ratings yet
Understanding Parallel Processing Techniques
12 pages
Parallel Processor Architectures Explained
No ratings yet
Parallel Processor Architectures Explained
47 pages
Parallel Processing & Multicore Systems
No ratings yet
Parallel Processing & Multicore Systems
16 pages
Computer Organization Overview
No ratings yet
Computer Organization Overview
118 pages
Parallel Computer Models Overview
No ratings yet
Parallel Computer Models Overview
91 pages
Parallel Processing in Computer Architecture
No ratings yet
Parallel Processing in Computer Architecture
32 pages
Parallel Processing in Computer Architecture
No ratings yet
Parallel Processing in Computer Architecture
32 pages
Understanding Parallel Processing Systems
No ratings yet
Understanding Parallel Processing Systems
37 pages
Digital System Design Lab Experiments
No ratings yet
Digital System Design Lab Experiments
47 pages
AI's Impact on HR Recruitment at Cisco
No ratings yet
AI's Impact on HR Recruitment at Cisco
67 pages
Agriconnect Dbms Project
No ratings yet
Agriconnect Dbms Project
27 pages
GTU Java Programs for Exams Guide
No ratings yet
GTU Java Programs for Exams Guide
16 pages
Acknowledgments by Stephen G. Simpson
No ratings yet
Acknowledgments by Stephen G. Simpson
2 pages
Firewall Selection Guidelines
No ratings yet
Firewall Selection Guidelines
5 pages
TeamViewer: Features and Benefits Overview
No ratings yet
TeamViewer: Features and Benefits Overview
22 pages
Digital Revolution's Effect on Knowledge Management
No ratings yet
Digital Revolution's Effect on Knowledge Management
15 pages
Security Consultant Resume - IAM Specialist
No ratings yet
Security Consultant Resume - IAM Specialist
3 pages
Class 11 Python Functions Overview
No ratings yet
Class 11 Python Functions Overview
14 pages
English For Starters 1 Activity Book 2014
No ratings yet
English For Starters 1 Activity Book 2014
96 pages
Understanding SOP Form in Boolean Algebra
No ratings yet
Understanding SOP Form in Boolean Algebra
13 pages
G630-U10 Simplified Maintenance Manual PDF
No ratings yet
G630-U10 Simplified Maintenance Manual PDF
63 pages
Exploring Windows Explorer Components
0% (1)
Exploring Windows Explorer Components
5 pages
Why Linux Kernel Lacks a Main Function
No ratings yet
Why Linux Kernel Lacks a Main Function
2 pages
KRAL Bem 150
No ratings yet
KRAL Bem 150
24 pages
Grade 7 ICT MS Excel Lab Revision 1
No ratings yet
Grade 7 ICT MS Excel Lab Revision 1
3 pages
Game Center Initialization Log Analysis
No ratings yet
Game Center Initialization Log Analysis
36 pages
Call Drop Analysis for Delhi Network
No ratings yet
Call Drop Analysis for Delhi Network
9 pages
8 Reconfiguration Strategies
No ratings yet
8 Reconfiguration Strategies
24 pages
Setting Up SQL Source Control Guide
No ratings yet
Setting Up SQL Source Control Guide
19 pages
First Integrated Circuits and Inventors
No ratings yet
First Integrated Circuits and Inventors
16 pages
Supplemental Nights Vol. 3 by Burton
No ratings yet
Supplemental Nights Vol. 3 by Burton
357 pages
Data Handling Basics in Python
No ratings yet
Data Handling Basics in Python
18 pages
CE7G-25 Question Paper Analysis
No ratings yet
CE7G-25 Question Paper Analysis
21 pages
Automation Questions
No ratings yet
Automation Questions
4 pages
Haermes Presentation201404
No ratings yet
Haermes Presentation201404
49 pages
OCI Foundations Exam 1z0-1085-20 Guide
75% (4)
OCI Foundations Exam 1z0-1085-20 Guide
16 pages
Star Topology LAN Network Setup Guide
No ratings yet
Star Topology LAN Network Setup Guide
27 pages
ARM Processor Structure Overview
No ratings yet
ARM Processor Structure Overview
74 pages

Multi-Processor Systems and Parallel Processing

Uploaded by

Multi-Processor Systems and Parallel Processing

Uploaded by

COMPUTER ORGANIZATION

Multi-Processor / Parallel Processing

Further MIMD can be subdivided into two main categories:

 Symmetric multiprocessor (SMP): In an SMP, multiple processors share a single memory or a

[Link] Kumar, CSE dept, AITAM 1

1. There are two or more similar processor of comparable capability.

Organization: The organization of a multiprocessor system is shown in Figure 10.1

Figure 10.1: Block diagram of tightly coupled multiprocessors

[Link] Kumar, CSE dept, AITAM 2

The organization of multiprocessor system can be classified as follows:

 Time shared or common bus

Time shared Bus:

Figure 10.2: Time shared bus

The following features are provided in time-shared bus organization:

[Link] Kumar, CSE dept, AITAM 3

Figure 10.3: Multiport memory

[Link] Kumar, CSE dept, AITAM 4

Non-uniform Memory Access(NUMA)

A typical CC-NUMA organization is shown in the Figure 10.4.

Figure 10.4: CC- NUMA Organization

Each node in the CC-NUMA system includes some main memory.

[Link] Kumar, CSE dept, AITAM 5

[Link] Kumar, CSE dept, AITAM 6

There are two reasons why complexity increases:

Figure 10.5: Crossbar Interconnection Network

[Link] Kumar, CSE dept, AITAM 7

[Link] Kumar, CSE dept, AITAM 8

Figure 10.6: Multistage Shuffle Network

A hypercube is an -dimensional cube that interconnects nodes. In addition to the communication

[Link] Kumar, CSE dept, AITAM 9

Figure 10.7: A 3-dimensional Hypercube Network

Then message traverses from to , they differ in 3 rd bit position.

[Link] Kumar, CSE dept, AITAM 10

Figure 10.8: A 2-D Mesh Network

A hierarchically structured network implemented in the form of a tree is another interconnection

[Link] Kumar, CSE dept, AITAM 11

Figure 10.9: A four way Tree Network

Figure 10.10: A Fat Tree

[Link] Kumar, CSE dept, AITAM 12

Figure 10.11: A Single Ring

Figure 10.12: Hierarchy of Rings

[Link] Kumar, CSE dept, AITAM 13

You might also like