0% found this document useful (0 votes)

12 views67 pages

Structured Patterns in Parallel Programming

The document discusses structured parallel programming patterns, which are codifications of best practices that enable reusable solutions for specific problems in parallel algorithm design. It covers various control flow patterns, parallel control patterns, data management patterns, and advanced patterns, emphasizing their applications and benefits in creating scalable and maintainable systems. The conclusion highlights the importance of exploring real-world applications and experimenting with frameworks like OpenMP and TBB to enhance parallel programming skills.

Uploaded by

winnerdam47

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

12 views67 pages

Structured Patterns in Parallel Programming

Uploaded by

winnerdam47

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Patterns

Based on Structured Parallel Programming

Dr.-Ing Frezewd Lemma

Goal

• Covers Patterns

1
Introduction to Patterns
What Are Patterns? i

• Codification of best practices in software engineering.

• Enable reusable solutions for specific problems.
• In parallel programming: Task distribution and data access
combinations that solves a specific problem in parallel
algorithm design.
• We give each pattern a specific name, which makes it much
easier to succinctly describe, discuss, and compare various
parallel algorithms.
• Algorithms are typically designed by composing patterns
• Provides you a high-level vocabulary for designing your own
algorithms and for understanding other people’s algorithms

2
Why Use Patterns? i

• Universality: Not tied to hardware or language.

• Simplicity: Provide a clear structure for designing algorithms.
• Determinism: Ensures repeatable results for easier debugging
and testing.

3
Nesting Patterns
Nesting Patterns i

• Fundamental compositional pattern in serial and parallel

programming.
• Enables hierarchical and recursive composition of patterns.
• Supports modular and scalable code structures.
• Most programming models, at present support only a fixed
number of nesting levels and may even map these levels
directly onto hardware components.
• Mapping the hierarchy of the program directly onto the
hardware hierarchy also makes code less future-proofed.
• Cilk Plus and TBB support arbitrary nesting while efficiently
mapping parallelism to physical hardware.

4
Nesting Patterns ii

• Examples of systems with limited nesting include:

◦ OpenCL, CUDA, C++ AMP, OpenMP

5
Nesting Example

Figure 1: Example of Nesting Patterns

6
Structured Serial Control Flow
Patterns
Control Flow Patterns: Sequence i

• Tasks execute in a specified order.

• Enforces ordered side effects (e.g., output).
• Example: ‘T = f(A); S = g(T); B = h(S);‘
• restrict the order to be the same as the texture order.
• T = f(A); S = g(A); B = h(S,T);
• the sequence pattern would still require executing g after f
• Superscalar Sequence : a parallel generalization of sequence

7
Control Flow Patterns: Selection i

• Executes one of two branches based on a condition.

• Exactly one branch is executed, ensuring mutual exclusivity.
• Example:
if (c) { a; } else { b; }

• There is a parallel generalization of selection, the speculative

selection pattern, which execute all of a, b, and c in parallel,

8
Control Flow Patterns: Iteration i

• Executes a task repeatedly while a condition holds true.

• The loop body typically modifies a state that the condition
uses for termination testing.
• Iteration assumes data dependencies between loop iterations
are manageable, although parallelization may be complicated
by these dependencies.
• Can use ‘for‘ or ‘while‘ loops.
• Example: ‘while (c) a; ‘

9
Control Flow Patterns: Recursion i

• A function calls itself, directly or indirectly, , enabling nested

tasks.
• It is usually associated with stack-based memory allocation or,
if higher-order functions are supported, closures (see Section
3.4.4) which are objects allocated on the heap.
• Tail recursion is a special form of recursion that can be
converted into iteration
• Examples:
◦ Factorial calculation
◦ Fibonacci sequence
◦ Divide-and-conquer algorithms (e.g., quicksort,
mergesort)

10
Control Flow Patterns: Recursion ii

• Dynamic form of nesting often implemented with stack-based

memory.

11
Parallel Control Patterns
Parallel Control Patterns: Introduction i

• Parallel control patterns extend the serial control patterns

• Parallel control patterns relaxes the assumptions of the serial
control patterns in various ways

12
Fork–Join i

• Splits control flow into parallel tasks and later joins them.
• Applications
◦ Used in frameworks like OpenMP and Cilk Plus.
◦ Divide and Conquer Algorithms: QuickSort: Divide the
array into parts, sort them in parallel, and join the
results. MergeSort: Recursively divide the array, sort the
halves, and merge them in parallel.
◦ Graph algorithms, Dynamic programming etc

13
Map Pattern i

• Applies a function to each element of a collection

independently.

Figure 2: The Map pattern

• Produces a new collection of the same shape.

• Embarrassingly parallel (no dependencies between operations).
• No communication or synchronization needed between tasks.

14
Map Pattern ii

• Ideal for SIMD (Single Instruction, Multiple Data)

architectures.
• Coarse-grained parallelism.
• Scales well with the number of processing units.
• Appllications
◦ Image Processing:
∗ Gamma correction.
∗ Thresholding.
∗ Color space conversions.
◦ Monte Carlo Sampling:
∗ Generating random samples for simulations.
◦ Ray Tracing:

15
Map Pattern iii

∗ Calculating pixel colors in rendering algorithms.

◦ Image filtering, array transformations, or functional
programming.

16
Stencil Pattern i

• Generalization of map pattern with neighborhood access.

• Applications:
◦ Image Filtering:
∗ Convolution.
∗ Median filtering.
◦ Video Processing:
∗ Motion estimation in video encoding.
◦ Scientific Simulations:
∗ Fluid flow simulations.
∗ Electromagnetic partial differential equation (PDE)
solvers.

17
Stencil Pattern ii

∗ Lattice quantum chromodynamics (QCD).

∗ Cellular automata (e.g., lattice Boltzmann methods).
◦ Noise Reduction:
∗ Isotropic diffusion.

18
Reduction i

• Combines all elements of a collection into a single value.

• Uses an associative combiner function (e.g., addition,
multiplication).
• Applications
◦ Numerical Applications:
∗ Summing elements of a collection.
∗ Dot products in matrix multiplication.
∗ Averaging Monte Carlo samples for integration.
◦ Convergence Testing:
∗ Used in iterative solutions of linear equations (e.g.,
conjugate gradient methods).

19
Reduction ii

◦ Image Processing:
∗ Image comparison metrics (e.g., in video encoding).
◦ Other Mathematical Operations:
∗ Maximum, minimum, multiplication, Boolean
AND/OR/XOR, etc.

Pseudocode Example
function reduce(array):
if [Link] == 1: return array[0]
else:
left = reduce(array[:mid])
right = reduce(array[mid:])
return combine(left, right)

20
Scan i

• Computes all partial reductions of a collection.

• Produces an array where each element is the reduction of all
previous elements.
• Applications:
◦ Integration:
∗ Computing cumulative sums or other cumulative
operations.
◦ Financial Modeling:
∗ Sequential decision simulations in option pricing.
◦ Random Number Generation:

21
Scan ii

∗ Generating sequences of random numbers (if

traditional pseudorandom generators are
parallelized).
◦ Data Packing:
∗ Combined with scatter for efficient memory
management.
• Associativity of the operation is crucial for parallelization.

22
Recurrence i

• Generalization of iteration where loop iterations depend on

previous computations.
• Causality: There must be a serial ordering for computation.
• Parallelization:
◦ Associative operations (e.g., prefix sums) can be
parallelized.
◦ Multi-dimensional recurrences are often parallelizable
using sweeps.
• Applications:
◦ Matrix Factorization:
∗ Numerical linear algebra.

23
Recurrence ii

◦ Image Processing:
∗ Advanced filtering operations.
◦ PDE Solvers:
∗ Solving partial differential equations in scientific
computations.
◦ Sequence Alignment:
∗ Used in bioinformatics and computational biology.

24
Serial Data Management Patterns
Random Read and Write i

• Uses pointers or indices for direct memory access.

• Challenges: Aliasing can complicate parallelization.

25
Heap and Stack Allocation i

• Stack: LIFO memory allocation (fast and locality-preserving).

• Heap: Dynamic allocation (slower, can fragment memory).

26
Closures and Objects i

• Closures
◦ Closures are function objects that can be constructed and
managed like data. Lambda functions
◦ When closures are built, they can often be used to
“capture” the state of non-local variables that they
reference
• Objects
◦ Objects are a language construct that associate data with
the code to act on and manage that data.
◦ Multiple functions may be associated with an object and
these functions are called the methods or member
functions of that object.

27
PARALLEL DATA MANAGEMENT
PATTERNS
Pack i

• Eliminates unused or irrelevant elements in a collection.

• Input elements marked with Boolean values:
◦ True: Element is retained.
◦ False: Element is discarded.
• Output: Contiguous sequence of selected elements in the
same order.
• Common Applications:
◦ Filtering results in image processing.
◦ Collision detection in simulations.

28
Pipeline i

• Connects tasks in a producer-consumer relationship.

• Each stage processes data and passes results to the next.
• All stages can run simultaneously, maintaining independent
states.
• Common Applications:
◦ Video encoding and decoding.
◦ Real-time data processing (e.g., audio streams).

29
Geometric Decomposition i

• Divides data into subcollections (overlapping or

non-overlapping).
• Example: Partitioning domains for stencil computations.

30
Partition Pattern i

• Special case of geometric decomposition with non-overlapping

regions.
• Common in matrix operations and image processing.

31
Gather i

• Collects elements from an input collection using an index

array.
• The shape of the output collection matches the index array.
• Optimizations:
◦ Coherent patterns (e.g., shifting or stencil) are more
efficient.
• Common Applications:
◦ Sparse matrix operations.
◦ Ray tracing and volume rendering.

32
Scatter i

• Writes input data to specific locations in a collection using an

index array.
• The inverse of the Gather pattern.
• Challenges:
◦ Handling collisions when multiple writes target the same
index.
◦ Requires a deterministic resolution strategy.
• Common Applications:
◦ Writing results of simulations.
◦ Updating sparse matrices.

33
Scatter ii

34
Advanced Patterns
Superscalar Sequences i

• Allows tasks to execute in parallel as long as data

dependencies are met.
• Example: Out-of-order execution in modern processors.

35
Futures and Speculative Selection i

• Futures: Non-hierarchical task management.

• Speculative Selection: Parallel execution of branches with
cancellation.

36
Workpile i

• Generalization of the Map pattern where tasks can

dynamically generate more tasks.
• Tasks are added to a “workpile” and processed as they arrive.
• Commonly used in:
◦ Recursive algorithms like tree searches.
◦ Dynamic workloads where task count is not predefined.
• Challenges:
◦ Efficient task scheduling.
◦ Avoiding bottlenecks in task generation.

37
Workpile ii

38
Search i

• Finds elements in a collection that meet specific criteria.

• Criteria can vary from simple key matches to complex
logical/arithmetic constraints.
• Applications:
◦ Databases: Relational queries (e.g., SQL).
◦ AI: Pathfinding and heuristic searches.
• Optimization:
◦ Sorting data for faster searches.
◦ Using parallel algorithms for large datasets.

39
Segmentation i

• Divides a collection into multiple segments for independent

processing.
• Each segment can be treated as a separate collection.
• Example:
◦ Summing elements in segments of an array.
• Applications:
◦ Multi-threaded data processing.
◦ Data-parallel algorithms in distributed systems.

40
Expand i

• Creates new data elements or collections based on an existing

one.
• The number of outputs is not predetermined and can vary
based on input.
• Common Applications:
◦ Generating neighboring states in simulations.
◦ Recursive enumeration (e.g., generating permutations or
combinations).
• Challenges:
◦ Managing memory for dynamically generated outputs.
◦ Ensuring scalability with irregular data generation.

41
Expand ii

42
Category Reduction i

• Groups data into categories and applies a reduction operation

within each category.
• Generalization of the Reduction pattern.
• Common Applications:
◦ Grouping sales data by region and computing total
revenue.
◦ Histograms and frequency counts.
• Challenges:
◦ Handling a large number of categories efficiently.
◦ Balancing load in parallel computations.

43
Category Reduction ii

44
PROGRAMMING MODEL
SUPPORT FOR PATTERNS
Programming Model Supports for Patterns i

• TBB: Nesting, Recursion, Fork-Join, Map, Workpile,

Reduction, Scan, Pipeline, Speculative Selection, Branch and
Bound
• OpenMP: Map, Workpile, Reduction, Fork-Join, Stencil,
Gemotric Decompostion, Gather, Scatter
• OpenCL: Map, Gather, Scatter, Reduction, Scan, Pack,
Expand, Stencil, Workpile, Superacalar Sequences, Geometric
Decompostion, Closures

45
Conclusion
Summary i

• Patterns provide reusable solutions for parallel programming.

• Enable scalable, maintainable, and deterministic designs.
• Foundation for developing high-performance systems.

46
Next Steps i

• Explore applications of patterns in real-world systems.

• Experiment with frameworks like OpenMP and TBB.
• Practice composing patterns for complex parallel algorithms.

47
The Map Pattern Implementation i

Definition:

• Applies a function to every element of a data collection in

parallel.
• Executes function invocations independently with no side
effects.

Key Characteristics:

• Foundation for parallelism with no interdependencies.

• Enables maximum concurrency with flexible scheduling.
• Basis of many efficient applications and patterns.

Applications:

48
The Map Pattern Implementation ii

• Embarrassingly parallel tasks and unrelated problem-solving.

• Combined with other patterns like gather, reduction, and scan.

Implementation Considerations:

• Avoids using separate threads for lightweight tasks.

• Parallelizes synchronization overhead efficiently.
• Supports vectorization and serial control flow nesting.

Examples:

• Simplified implementations using TBB, Cilk Plus, OpenMP,

etc.
• Map-Reduce as a practical extension (e.g., Google’s systems).

49
Implementing the Map Pattern Using Different Programming
Models i

• Example Problem: SCALED VECTOR ADDITION - SAXPY

◦ Definition: SAXPY stands for ”Single-Precision A·X
Plus Y.” It performs a scaled vector addition where a
vector x is scaled by a scalar a and added to another
vector y , elementwise.

y ← ax + y

◦ Components:
∗ Inputs: Vector x, scalar a, and vector y .
∗ Output: Updated vector y , with old values
overwritten.

50
Implementing the Map Pattern Using Different Programming
Models ii

◦ Use Cases: Common in linear algebra operations, such

as row cancellation in Gaussian elimination.
◦ Uniform and Varying Parameters:
∗ Uniform Parameters: Scalar a, which remains
constant across operations.
∗ Varying Parameters: Elements of vectors x and y ,
which vary across operations.
◦ Elemental Function Representation: The operation
for each element can be described as:

f (t, p, q) = tp + q

where t = a, p = xi , and q = yi .
51
Implementing the Map Pattern Using Different Programming
Models iii

◦ Parallelization:
∗ Suitable for parallel implementation using the map
pattern, as each iteration is independent.
∗ Arithmetic intensity is low, which limits scalability,
making SAXPY less suitable for highly parallel
systems compared to more complex operations.
◦ Scalability: As a Level 1 BLAS routine, SAXPY does
not scale well due to low arithmetic intensity.
Higher-level BLAS routines (Level 2 and 3) with more
work per unit of parallelism scale better.

52
Implementing the Map Pattern Using Different Programming
Models iv

◦ Implementation Examples: Demonstrated in multiple

parallel programming models, including TBB, OpenMP,
and OpenCL.
◦ Serial Implementation: Provides a baseline, where each
element is processed sequentially in a loop, with
independent iterations.
• Porgramming Models
◦ Serial
◦ TBB
◦ OpenMp
◦ OpenCl

53
Serial Implementation

Figure 3: SAXPY Serial Implementation

54
Map With TBB

Figure 4: SAXPY TBB Implementation 55

Map With OpenMP

Figure 5: SAXPY OpenMP Implementation

56
Map With OpenCl

Figure 6: SAXPY OpenCL Implementation

Structured Parallel Programming Patterns
No ratings yet
Structured Parallel Programming Patterns
210 pages
Discover The Complete Collection of Resources
No ratings yet
Discover The Complete Collection of Resources
71 pages
Preliminary
No ratings yet
Preliminary
169 pages
Programming Models for Parallel Systems
No ratings yet
Programming Models for Parallel Systems
21 pages
Unit 4 HPC
No ratings yet
Unit 4 HPC
43 pages
Understanding Parallel Algorithms and CUDA
No ratings yet
Understanding Parallel Algorithms and CUDA
89 pages
Dokumen - Pub - Patterns For Parallel Programming 0321228111 9780321228116
No ratings yet
Dokumen - Pub - Patterns For Parallel Programming 0321228111 9780321228116
373 pages
Massing Ill 4
No ratings yet
Massing Ill 4
29 pages
(Ebook) Parallel Programming with Microsoft .NET: Design Patterns for Decomposition and Coordination on Multicore Architectures (Patterns & Practices) by Colin Campbell, Ralph Johnson, Ade Miller, Stephen Toub ISBN 9780735651593, 0735651590 online pdf
No ratings yet
(Ebook) Parallel Programming with Microsoft .NET: Design Patterns for Decomposition and Coordination on Multicore Architectures (Patterns & Practices) by Colin Campbell, Ralph Johnson, Ade Miller, Stephen Toub ISBN 9780735651593, 0735651590 online pdf
72 pages
Unit III
No ratings yet
Unit III
31 pages
Parallel Computing Patterns & Optimization
No ratings yet
Parallel Computing Patterns & Optimization
46 pages
Basics of Multithreading in Modern C The Concise Guide 1770487133
No ratings yet
Basics of Multithreading in Modern C The Concise Guide 1770487133
69 pages
Introduction to Parallel Computing Concepts
No ratings yet
Introduction to Parallel Computing Concepts
14 pages
High Performance Computing Overview
No ratings yet
High Performance Computing Overview
42 pages
Parallel Computing Explained: Types & Benefits
No ratings yet
Parallel Computing Explained: Types & Benefits
4 pages
Understanding Parallel Programming Basics
No ratings yet
Understanding Parallel Programming Basics
4 pages
Intermediate Representation: Goals
No ratings yet
Intermediate Representation: Goals
40 pages
Sequence Control in Programming Concepts
No ratings yet
Sequence Control in Programming Concepts
4 pages
Foundations of Parallel Programming
No ratings yet
Foundations of Parallel Programming
45 pages
Arallel Rogramming With Keletons
No ratings yet
Arallel Rogramming With Keletons
6 pages
Parallel Programming Patterns Explained
No ratings yet
Parallel Programming Patterns Explained
89 pages
CUDA Programming for Parallel Computing
No ratings yet
CUDA Programming for Parallel Computing
55 pages
Parallel Programming: Concepts and Practice González-Domínguez PDF Version
No ratings yet
Parallel Programming: Concepts and Practice González-Domínguez PDF Version
184 pages
Understanding Parallel Computing Concepts
No ratings yet
Understanding Parallel Computing Concepts
96 pages
Advanced Computer Architecture Overview
No ratings yet
Advanced Computer Architecture Overview
26 pages
Overview of Parallel Programming Models
No ratings yet
Overview of Parallel Programming Models
84 pages
Understanding Parallel Programming
No ratings yet
Understanding Parallel Programming
42 pages
Unit 4 HPC
No ratings yet
Unit 4 HPC
42 pages
My Coding Strategy in 5 Languages
No ratings yet
My Coding Strategy in 5 Languages
42 pages
Understanding Parallel Skeletons in PRAM
No ratings yet
Understanding Parallel Skeletons in PRAM
6 pages
The BETA Programming Language
No ratings yet
The BETA Programming Language
61 pages
Finding Concurrency Patterns for Parallel Programming
No ratings yet
Finding Concurrency Patterns for Parallel Programming
36 pages
Parallel and Distributed Programming Overview
No ratings yet
Parallel and Distributed Programming Overview
6 pages
Ebook Multicore and GPU Programming: An Integrated Approach by Gerassimos Barlas ISBN 9780128141212, 0128141212 Available All Format
100% (3)
Ebook Multicore and GPU Programming: An Integrated Approach by Gerassimos Barlas ISBN 9780128141212, 0128141212 Available All Format
195 pages
Parallel Programming: Concepts and Practice 1st Edition - Complete Guide
100% (2)
Parallel Programming: Concepts and Practice 1st Edition - Complete Guide
166 pages
Performance Optimization in Parallel Programming
No ratings yet
Performance Optimization in Parallel Programming
40 pages
Illinois NVIDIA GPU Teaching Kit Overview
No ratings yet
Illinois NVIDIA GPU Teaching Kit Overview
64 pages
Recent Trends in Parallel Computing
No ratings yet
Recent Trends in Parallel Computing
12 pages
Understanding Design Patterns in Software
No ratings yet
Understanding Design Patterns in Software
129 pages
Overview of Programming Paradigms
No ratings yet
Overview of Programming Paradigms
6 pages
Programming Concepts and Patterns Guide
No ratings yet
Programming Concepts and Patterns Guide
145 pages
Essential Design Patterns for Embedded Firmware
No ratings yet
Essential Design Patterns for Embedded Firmware
5 pages
Patterns of Parallel Programming
No ratings yet
Patterns of Parallel Programming
117 pages
Overview of Programming Languages
No ratings yet
Overview of Programming Languages
14 pages
Parallel Programming Concepts and Practice 1st Edition Bertil Schmidt Ebook High-Resolution PDF
100% (6)
Parallel Programming Concepts and Practice 1st Edition Bertil Schmidt Ebook High-Resolution PDF
55 pages
Chapter Two Parallel Computing
No ratings yet
Chapter Two Parallel Computing
65 pages
Patterns of Parallel Programming in C#
No ratings yet
Patterns of Parallel Programming in C#
118 pages
Concurrent Collections Programming Model
No ratings yet
Concurrent Collections Programming Model
22 pages
Wa0001.
No ratings yet
Wa0001.
8 pages
Imperative Programming Overview and Paradigms
No ratings yet
Imperative Programming Overview and Paradigms
49 pages
Parallel Computing Architectures & Algorithms
No ratings yet
Parallel Computing Architectures & Algorithms
52 pages
Untitled Document
No ratings yet
Untitled Document
9 pages
CPP Concurrencywithmodern CPP Grimm 2024
No ratings yet
CPP Concurrencywithmodern CPP Grimm 2024
18 pages
Overview of Runtime Environments
No ratings yet
Overview of Runtime Environments
51 pages
(Ebook) Parallel Programming: Concepts and Practice by Bertil Schmidt, Jorge Gonzalez-Dominguez, Christian Hundt, Moritz Schlarb ISBN 9780128498903, 0128498900 Full
100% (1)
(Ebook) Parallel Programming: Concepts and Practice by Bertil Schmidt, Jorge Gonzalez-Dominguez, Christian Hundt, Moritz Schlarb ISBN 9780128498903, 0128498900 Full
64 pages
Parallel Programming Models Explained
No ratings yet
Parallel Programming Models Explained
13 pages
Advanced Imperative Programming Concepts
No ratings yet
Advanced Imperative Programming Concepts
49 pages
Introduction
No ratings yet
Introduction
8 pages
Falcon Universal Confidentiality Agreement
No ratings yet
Falcon Universal Confidentiality Agreement
9 pages
Understanding Parallel Computing Basics
No ratings yet
Understanding Parallel Computing Basics
24 pages
Parallel Software Development Guide
No ratings yet
Parallel Software Development Guide
35 pages
SmartFarm Solutions: Agri-Tech Innovation
100% (1)
SmartFarm Solutions: Agri-Tech Innovation
4 pages
Undecidable Problems in Automata Theory
No ratings yet
Undecidable Problems in Automata Theory
28 pages
2IL50 Data Structures: 2018-19 Q3 Lecture 1: Introduction
No ratings yet
2IL50 Data Structures: 2018-19 Q3 Lecture 1: Introduction
61 pages
Overview of Sorting Algorithms in C++
No ratings yet
Overview of Sorting Algorithms in C++
40 pages
Programming Concepts and Algorithms Guide
No ratings yet
Programming Concepts and Algorithms Guide
36 pages
Rishi Vector Sorting Algorithms
No ratings yet
Rishi Vector Sorting Algorithms
11 pages
Solving Linear Recurrence Relations
No ratings yet
Solving Linear Recurrence Relations
23 pages
KS4 Algorithms Assessment Answers
No ratings yet
KS4 Algorithms Assessment Answers
6 pages
Sorting Algorithms Practice Questions
No ratings yet
Sorting Algorithms Practice Questions
4 pages
Sorting Algorithms Performance Analysis
No ratings yet
Sorting Algorithms Performance Analysis
26 pages
Sorting Algorithms Overview and Analysis
No ratings yet
Sorting Algorithms Overview and Analysis
8 pages
Overview of Sorting Algorithms
No ratings yet
Overview of Sorting Algorithms
13 pages
Bubble and Merge Sort with OpenMP
No ratings yet
Bubble and Merge Sort with OpenMP
5 pages
Applications of O(nlogn) Sorting Algorithms
No ratings yet
Applications of O(nlogn) Sorting Algorithms
3 pages
Understanding Algorithms in Computing
No ratings yet
Understanding Algorithms in Computing
8 pages
Divide and Conquer Algorithm Overview
No ratings yet
Divide and Conquer Algorithm Overview
35 pages
Handling Left Parentheses in Stacks
No ratings yet
Handling Left Parentheses in Stacks
19 pages
TCS NQT 2026 Important Questions
No ratings yet
TCS NQT 2026 Important Questions
11 pages
Linear and Binary Search Explained
No ratings yet
Linear and Binary Search Explained
13 pages
Data Structures Overview and Types
No ratings yet
Data Structures Overview and Types
16 pages
CSC Curriculum and Subject List
No ratings yet
CSC Curriculum and Subject List
14 pages
Sorting Algorithms: Bubble, Insertion, Merge, Quick
No ratings yet
Sorting Algorithms: Bubble, Insertion, Merge, Quick
4 pages
12 Sorting Techniques
No ratings yet
12 Sorting Techniques
75 pages
C++ Programming Concepts and Techniques
No ratings yet
C++ Programming Concepts and Techniques
3 pages
Organizing Songs in Ascending Order
No ratings yet
Organizing Songs in Ascending Order
3 pages
COL106: Algorithms by Ragesh Jaiswal
No ratings yet
COL106: Algorithms by Ragesh Jaiswal
35 pages
Quick Sort vs Merge Sort Explained
No ratings yet
Quick Sort vs Merge Sort Explained
17 pages
Unit-1.2 - Algorithm and Complexity
No ratings yet
Unit-1.2 - Algorithm and Complexity
6 pages
Algorithm Analysis and Notations Guide
No ratings yet
Algorithm Analysis and Notations Guide
16 pages
Parallel and Distributed Query Processing
No ratings yet
Parallel and Distributed Query Processing
7 pages
Sorting Algorithms and Time Complexity Analysis
No ratings yet
Sorting Algorithms and Time Complexity Analysis
18 pages
Sorting and Algorithm Analysis
No ratings yet
Sorting and Algorithm Analysis
32 pages

Structured Patterns in Parallel Programming

Uploaded by

Structured Patterns in Parallel Programming

Uploaded by

Patterns

Based on Structured Parallel Programming

Dr.-Ing Frezewd Lemma

• Codification of best practices in software engineering.

• Universality: Not tied to hardware or language.

• Fundamental compositional pattern in serial and parallel

• Examples of systems with limited nesting include:

Figure 1: Example of Nesting Patterns

• Tasks execute in a specified order.

• Executes one of two branches based on a condition.

• There is a parallel generalization of selection, the speculative

• Executes a task repeatedly while a condition holds true.

• A function calls itself, directly or indirectly, , enabling nested

• Dynamic form of nesting often implemented with stack-based

• Parallel control patterns extend the serial control patterns

• Applies a function to each element of a collection

Figure 2: The Map pattern

• Produces a new collection of the same shape.

• Ideal for SIMD (Single Instruction, Multiple Data)

∗ Calculating pixel colors in rendering algorithms.

• Generalization of map pattern with neighborhood access.

∗ Lattice quantum chromodynamics (QCD).

• Combines all elements of a collection into a single value.

• Computes all partial reductions of a collection.

∗ Generating sequences of random numbers (if

• Generalization of iteration where loop iterations depend on

• Uses pointers or indices for direct memory access.

• Stack: LIFO memory allocation (fast and locality-preserving).

• Eliminates unused or irrelevant elements in a collection.

• Connects tasks in a producer-consumer relationship.

• Divides data into subcollections (overlapping or

• Special case of geometric decomposition with non-overlapping

• Collects elements from an input collection using an index

• Writes input data to specific locations in a collection using an

• Allows tasks to execute in parallel as long as data

• Futures: Non-hierarchical task management.

• Generalization of the Map pattern where tasks can

• Finds elements in a collection that meet specific criteria.

• Divides a collection into multiple segments for independent

• Creates new data elements or collections based on an existing

• Groups data into categories and applies a reduction operation

• TBB: Nesting, Recursion, Fork-Join, Map, Workpile,

• Patterns provide reusable solutions for parallel programming.

• Explore applications of patterns in real-world systems.

• Applies a function to every element of a data collection in

• Foundation for parallelism with no interdependencies.

• Embarrassingly parallel tasks and unrelated problem-solving.

• Avoids using separate threads for lightweight tasks.

• Simplified implementations using TBB, Cilk Plus, OpenMP,

• Example Problem: SCALED VECTOR ADDITION - SAXPY

◦ Use Cases: Common in linear algebra operations, such

◦ Implementation Examples: Demonstrated in multiple

Figure 3: SAXPY Serial Implementation

Figure 4: SAXPY TBB Implementation 55

Figure 5: SAXPY OpenMP Implementation

Figure 6: SAXPY OpenCL Implementation

You might also like