Modern classification (Sima, Fountain, Kacsuk)
The modern classification proposed by Sima, Fountain, and Kacsuk focuses on how
parallelism is achieved in computer architectures. Here are the key categories:
▪ Data Parallelism: In this approach, the same function operates on many data elements
simultaneously. It emphasizes parallelism at the data level.
▪ Function Parallelism: Multiple functions are performed in parallel. This category
emphasizes parallelism at the functional level.
▪ Control Parallelism: Different tasks execute concurrently, each following its own
control flow.
These classifications help us understand the design spaces of advanced architectures.
1. Data Parallelism
Definition:
The same operation is performed on different pieces of data simultaneously by multiple
processors.
Example: Image Processing
• Suppose we want to increase brightness of a large image.
• The image is divided into many parts.
• Each processor works on a different part of the image but performs the same
operation (increase pixel value).
Example work split:
• Processor 1 → pixels 1–1000
• Processor 2 → pixels 1001–2000
• Processor 3 → pixels 2001–3000
All processors perform the same instruction on different data blocks.
Another example: matrix multiplication, where each processor computes a different block
of the result matrix.
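The brightness example can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the names brighten, split, and brighten_parallel are hypothetical, the image is a flat list of grayscale values, and a thread pool stands in for the processors. In standard CPython builds the threads show the decomposition rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def brighten(block, amount=40):
    """Apply the same operation to every pixel of one block (clamped at 255)."""
    return [min(255, p + amount) for p in block]

def split(pixels, parts):
    """Divide the pixel list into contiguous blocks of nearly equal size."""
    size = max(1, -(-len(pixels) // parts))  # ceiling division
    return [pixels[i:i + size] for i in range(0, len(pixels), size)]

def brighten_parallel(pixels, workers=3, amount=40):
    blocks = split(pixels, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps block order, so the results can be concatenated directly
        results = pool.map(lambda b: brighten(b, amount), blocks)
        return [p for block in results for p in block]

image = [i % 256 for i in range(3000)]  # stand-in for 3000 grayscale pixels
brighter = brighten_parallel(image)     # three workers, 1000 pixels each
```

Every worker runs the same brighten function; only the block of data it receives differs.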
2. Function Parallelism
Definition:
Different processors perform different functions (tasks) on the same or different data
simultaneously.
Example: Video Streaming Pipeline
In a video streaming system:
• Processor 1 → Decode video
• Processor 2 → Apply filters
• Processor 3 → Compress video
• Processor 4 → Send video over network
Each processor performs a different function at the same time.
Other examples:
• Compiler stages (lexical analysis, parsing, optimization)
• Web server handling authentication, logging, and request processing.
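A pipeline like the video streaming example can be sketched with one thread per stage and queues between stages. This is an illustrative sketch under stated assumptions: decode, apply_filter, and compress are hypothetical stand-ins that only tag a string, and the final list stands in for the network-send stage.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream

def stage(func, inbox, outbox):
    """Run one pipeline stage: apply func to each item until DONE arrives."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # pass the end marker downstream
            return
        outbox.put(func(item))

# Hypothetical stand-ins for the real stage functions.
def decode(frame):
    return f"decoded({frame})"

def apply_filter(frame):
    return f"filtered({frame})"

def compress(frame):
    return f"compressed({frame})"

def run_pipeline(frames):
    funcs = [decode, apply_filter, compress]
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
               for i, f in enumerate(funcs)]
    for t in threads:
        t.start()
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(DONE)
    out = []
    while (item := queues[-1].get()) is not DONE:
        out.append(item)
    for t in threads:
        t.join()
    return out

sent = run_pipeline(["f0", "f1", "f2"])
```

While one frame is being compressed, the next can be filtered and a third decoded, so each thread performs a different function at the same time.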
3. Control Parallelism
Definition:
Different processors execute different control flows or program paths simultaneously
based on conditions.
Example: Web Server Handling Multiple Requests
A web server receives multiple requests:
• Processor 1 → handles login request
• Processor 2 → handles database query
• Processor 3 → handles file download
Each processor executes different instructions depending on control decisions.
Example platforms:
• Multithreaded servers
• Cluster computing frameworks such as Apache Spark
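The web server example can be sketched as a dispatcher that picks a handler from the request type. This is a minimal sketch: the handler names, the request dictionaries, and their fields are all hypothetical, and a thread pool stands in for the processors.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical handlers; each one represents a different program path.
def handle_login(req):
    return f"login ok for {req['user']}"

def handle_query(req):
    return f"rows for {req['sql']}"

def handle_download(req):
    return f"bytes of {req['path']}"

HANDLERS = {"login": handle_login, "query": handle_query, "download": handle_download}

def dispatch(req):
    """Control decision: choose the code path from the request type."""
    handler = HANDLERS.get(req["type"])
    if handler is None:
        return f"unknown request type: {req['type']}"
    return handler(req)

requests = [
    {"type": "login", "user": "alice"},
    {"type": "query", "sql": "SELECT 1"},
    {"type": "download", "path": "/files/report.pdf"},
]

with ThreadPoolExecutor(max_workers=3) as pool:
    responses = list(pool.map(dispatch, requests))
```

Each worker enters dispatch with a different request and therefore follows a different branch of the program.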
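Function-Parallel Architectures:
Function parallelism can be exploited at three levels of granularity: instruction, thread,
and process.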
1. Instruction-Level Parallelism (ILP)
Parallelism occurs within a single CPU where multiple instructions are executed
simultaneously.
Types
1. Pipelined Processors
• Instruction execution is divided into stages.
• Different stages execute different instructions simultaneously.
Example stages:
1. Fetch
2. Decode
3. Execute
4. Memory access
5. Write back
Example processor:
• Intel Pentium, whose integer pipeline has five stages (its stage names differ from the
generic list above).
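The overlap between stages can be made concrete with a small timing model. The sketch below assumes an ideal five-stage pipeline with no stalls or hazards; the function name pipeline_schedule and the stage abbreviations are illustrative choices.

```python
# Fetch, Decode, Execute, Memory access, Write back
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_schedule(n_instructions, stages=STAGES):
    """Return one dict per clock cycle mapping stage name -> instruction index."""
    total_cycles = n_instructions + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for s, name in enumerate(stages):
            instr = cycle - s  # instruction i occupies stage s during cycle i + s
            if 0 <= instr < n_instructions:
                row[name] = instr
        schedule.append(row)
    return schedule

sched = pipeline_schedule(4)
for cycle, row in enumerate(sched, start=1):
    print(f"cycle {cycle}: " + "  ".join(f"{st}=I{i}" for st, i in row.items()))
```

Under these assumptions, n instructions finish in n + 5 - 1 cycles instead of 5n: four instructions take 8 cycles rather than 20.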
2. VLIW Processors (Very Long Instruction Word)
• A single long instruction contains multiple operations that are executed in
parallel.
• The compiler, not the hardware, decides which operations run simultaneously.
Example systems:
• Intel Itanium processors (EPIC, a VLIW-derived design).
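The compiler's job can be illustrated with a toy bundle packer. This is a simplified sketch, not how a production VLIW compiler works: it assumes every operation takes one cycle, every register is written only once, and only read-after-write dependences matter. The name pack_bundles and the (dest, src1, src2) encoding are hypothetical.

```python
def pack_bundles(ops, width=3):
    """Pack operations into long instruction words at compile time.

    ops: list of (dest, src1, src2, ...) tuples in program order.
    Returns a list of bundles; each bundle is a list of operation indices.
    """
    ready_at = {}  # register -> first bundle in which its value can be read
    bundles = []
    for idx, (dest, *srcs) in enumerate(ops):
        # Program inputs (never written by an earlier op) are ready in bundle 0.
        earliest = max((ready_at.get(s, 0) for s in srcs), default=0)
        b = earliest
        while b < len(bundles) and len(bundles[b]) >= width:
            b += 1  # bundle is full, try the next one
        while len(bundles) <= b:
            bundles.append([])
        bundles[b].append(idx)
        ready_at[dest] = b + 1  # result is usable from the following bundle
    return bundles

ops = [
    ("r1", "a", "b"),
    ("r2", "c", "d"),
    ("r3", "r1", "r2"),  # needs r1 and r2
    ("r4", "e", "f"),
    ("r5", "r3", "r4"),  # needs r3 and r4
]
bundles = pack_bundles(ops, width=2)
```

With a width of 2, the five operations fit into three long instructions: [[0, 1], [2, 3], [4]]. The grouping is fixed before the program runs, so the hardware does not need to check dependences itself.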
3. Superscalar Processors
• The CPU can issue multiple instructions per clock cycle.
• The hardware detects independent instructions at run time and issues them together.
Example:
• Modern CPUs such as Intel Core i7.
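The run-time search for independent instructions can be modeled with a short simulation. This sketch is a deliberate simplification: it assumes one-cycle latency, registers written only once, and an issue window that covers all pending instructions. The name issue_schedule and the (dest, sources) encoding are hypothetical.

```python
def issue_schedule(instrs, width=2):
    """Simulate a simplified superscalar issue stage.

    instrs: list of (dest, sources) tuples in program order.
    Returns one list per clock cycle with the indices issued in that cycle.
    """
    produced = {dest for dest, _ in instrs}  # registers written by this code
    available = set()                        # results completed so far
    pending = list(range(len(instrs)))
    cycles = []
    while pending:
        issued = []
        for i in pending:
            if len(issued) == width:
                break
            _, srcs = instrs[i]
            # A source is ready if it is a program input or already computed.
            if all(s not in produced or s in available for s in srcs):
                issued.append(i)
        if not issued:
            raise ValueError("no instruction can issue: cyclic dependence")
        available.update(instrs[i][0] for i in issued)
        pending = [i for i in pending if i not in issued]
        cycles.append(issued)
    return cycles

program = [
    ("r1", ("a",)),
    ("r2", ("r1", "b")),   # depends on instruction 0
    ("r3", ("c",)),
    ("r4", ("r3", "d")),   # depends on instruction 2
    ("r5", ("r2", "r4")),  # depends on instructions 1 and 3
]
cycles = issue_schedule(program, width=2)
```

With two issue slots the model issues instructions 0 and 2 together, then 1 and 3, then 4, so the dependent instruction 1 waits while the independent instruction 2 moves ahead of it.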
2. Thread-Level Parallelism (TLP)
Parallelism occurs when multiple threads of the same program run simultaneously.
Characteristics:
• Threads share the same process memory.
• Usually executed on multi-core processors.
Example:
• A web browser running:
o one thread for UI
o one for network communication
o one for page rendering
Example platform:
• Multithreading with POSIX Threads (pthreads).
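The browser example can be sketched with Python's threading module, which CPython builds on top of POSIX threads on POSIX systems. The task functions and the page_state dictionary are hypothetical; the point is that all three threads read and write the same process memory.

```python
import threading

page_state = {}                # shared by every thread in the process
state_lock = threading.Lock()  # protects page_state against concurrent updates

def ui_task():
    with state_lock:
        page_state["ui"] = "toolbar drawn"

def network_task():
    with state_lock:
        page_state["network"] = "page fetched"

def render_task():
    with state_lock:
        page_state["render"] = "layout computed"

threads = [threading.Thread(target=task, name=task.__name__)
           for task in (ui_task, network_task, render_task)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait until all three threads have finished
```

After the joins, page_state holds the entries written by all three threads, which is only possible because threads share one address space.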
3. Process-Level Parallelism (MIMD)
Parallelism occurs when multiple processes execute different programs or tasks
simultaneously.
This corresponds to the MIMD (Multiple Instruction, Multiple Data) class in Michael J.
Flynn's taxonomy.
1. Distributed Memory MIMD
• Each processor has its own local memory.
• Processors communicate using message passing.
Example systems:
• Cluster computing
• Systems using MPI (Message Passing Interface).
Example real-world platforms:
• Apache Hadoop clusters.
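The message-passing style can be sketched without a real cluster. In the sketch below, threads and a queue are stand-ins for separate nodes and the network link, so it shows only the programming discipline: each node touches nothing but its own local data and communicates solely by sending a message. It is not an MPI program, and the names node and distributed_sum are hypothetical.

```python
import queue
import threading

def node(rank, local_data, to_root):
    """A node sees only its local data and reports its result as a message."""
    partial = sum(local_data)
    to_root.put((rank, partial))

def distributed_sum(data, n_nodes=4):
    to_root = queue.Queue()  # stand-in for the network link to the root node
    chunks = [data[r::n_nodes] for r in range(n_nodes)]  # scatter the data
    workers = [threading.Thread(target=node, args=(r, chunks[r], to_root))
               for r in range(n_nodes)]
    for w in workers:
        w.start()
    # Gather exactly one (rank, partial) message from every node.
    partials = dict(to_root.get() for _ in range(n_nodes))
    for w in workers:
        w.join()
    return sum(partials.values()), partials

total, partials = distributed_sum(list(range(1, 101)))
```

The scatter, local compute, and gather steps mirror the structure of a typical MPI reduction, where the same roles are played by processes on different machines.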
2. Shared Memory MIMD
• Multiple processors share the same memory space.
• Communication occurs through shared variables.
Example:
• Multicore systems using OpenMP.
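Communication through a shared variable can be sketched as follows. Python threads are used here as a stand-in for OpenMP threads, and the name add_range is hypothetical. The pattern resembles a reduction: each thread computes a private partial result and then updates the shared variable under a lock.

```python
import threading

total = 0                      # shared variable visible to every thread
total_lock = threading.Lock()

def add_range(start, stop):
    global total
    local = sum(range(start, stop))  # private partial result
    with total_lock:                 # one thread at a time updates the shared variable
        total += local

threads = [threading.Thread(target=add_range, args=(i * 250, (i + 1) * 250))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, two threads could read the same old value of total and one update would be lost, which is the main hazard of shared-memory communication.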