Lecture 46 - Vector Processing

The document discusses Vector (SIMD) Processing, which allows parallel operations on multiple data elements using single-instruction multiple-data (SIMD) instructions. It highlights the importance of data parallelism and vector registers in enhancing processor performance, as well as the role of vectorizing compilers in optimizing loops for vector instructions. An example illustrates how conventional assembly instructions can be replaced with vector instructions to improve efficiency in processing arrays.

Uploaded by

24f3002835

Vector (SIMD) Processing

Carl Hamacher, Zvonko Vranesic and Safwat Zaky, Computer Organization and
Embedded Systems, (6e), McGraw Hill Publication, 2017.
Ch 12: 12.2

Vector (SIMD) Processing
• Many computationally demanding applications use loops to perform operations on vectors of data, where a vector is an array of elements such as integers or floating-point numbers.
• When a processor executes the instructions in such a loop, the operations are performed one at a time on individual vector elements, so many instructions must be executed to process all vector elements.
• A processor can be enhanced with multiple ALUs so that a single instruction operates on multiple data elements in parallel.
• Such instructions are called single-instruction multiple-data (SIMD) instructions, or vector instructions.
• These instructions can be used only when the operations performed in parallel are independent of one another. This property is known as data parallelism.
• The data for vector instructions are held in vector registers, each of which can hold several data elements. The number of elements, L, in each vector register is called the vector length.
• The vector length determines the number of operations that can be performed in parallel on multiple ALUs.
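As a rough illustration of data parallelism, the sketch below simulates one vector operation as L independent per-lane additions. The name `add_lanes` and the value `VL = 4` are assumptions made for this example only; they do not come from the lecture, and real hardware fixes its own vector length.

```c
#include <assert.h>
#include <stddef.h>

#define VL 4 /* assumed vector length L (elements per vector register) */

/* Models one vector add: VL independent sums, one per lane.
   Hardware with VL ALUs could compute all lanes at the same time;
   this loop only simulates that behaviour sequentially. */
void add_lanes(const int *x, const int *y, int *sum)
{
    assert(x && y && sum);
    for (size_t lane = 0; lane < VL; lane++) {
        /* No lane reads a result produced by another lane. */
        sum[lane] = x[lane] + y[lane];
    }
}
```

Because no lane depends on another, the order in which the lanes are evaluated does not affect the result, which is what allows them to run on separate ALUs.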
Vector (SIMD) Processing
• The vector instruction
    VectorAdd.S Vi, Vj, Vk
  computes L sums using the elements in vector registers Vj and Vk, and places the resulting sums in vector register Vi.
• The suffix S denotes the size of each data element.
• Special instructions are needed to transfer multiple data elements between a vector register and the memory. The instruction
    VectorLoad.S Vi, X(Rj)
  causes L consecutive elements beginning at memory location X + [Rj] to be loaded into vector register Vi. Similarly, the instruction
    VectorStore.S Vi, X(Rj)
  causes the contents of vector register Vi to be stored in L consecutive memory locations.
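The behaviour of these three instructions can be modelled in plain C. This is only a software sketch: the type `vreg_t`, the function names, the 32-bit element size, and `VL = 4` are all assumptions introduced here, and the `addr` parameter stands in for the effective address X + [Rj].

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VL 4 /* assumed vector length L */

/* A vector register holding VL elements (32-bit size is an assumption). */
typedef struct {
    int32_t elem[VL];
} vreg_t;

/* Like VectorLoad.S Vi, X(Rj): copy VL consecutive elements from memory. */
void vector_load(vreg_t *vi, const int32_t *addr)
{
    assert(vi && addr);
    memcpy(vi->elem, addr, sizeof vi->elem);
}

/* Like VectorAdd.S Vi, Vj, Vk: compute VL element-wise sums. */
void vector_add(vreg_t *vi, const vreg_t *vj, const vreg_t *vk)
{
    assert(vi && vj && vk);
    for (int lane = 0; lane < VL; lane++) {
        vi->elem[lane] = vj->elem[lane] + vk->elem[lane];
    }
}

/* Like VectorStore.S Vi, X(Rj): write VL consecutive elements to memory. */
void vector_store(const vreg_t *vi, int32_t *addr)
{
    assert(vi && addr);
    memcpy(addr, vi->elem, sizeof vi->elem);
}
```

Loading two registers, adding them, and storing the result reproduces an element-wise sum of VL array elements with three data-processing calls instead of VL separate load, add, and store steps.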
Vectorization
• In a source program written in a high-level language, loops that operate on arrays of integers or floating-point numbers are vectorizable if the operations performed in each pass are independent of the other passes.
• Using vector instructions reduces the number of instructions that need to be executed and enables the operations to be performed in parallel on multiple ALUs.
• A vectorizing compiler can recognize such loops, if they are not too complex, and generate vector instructions.
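To make the independence condition concrete, the sketch below contrasts a loop whose passes are independent with one whose passes are not. Both functions and their names are illustrative assumptions, not examples taken from the lecture.

```c
#include <assert.h>
#include <stddef.h>

/* Vectorizable: pass i reads b[i] and c[i] and writes only a[i],
   so no pass uses a result from another pass.
   Assumes that a does not overlap b or c. */
void add_arrays(int *a, const int *b, const int *c, size_t n)
{
    assert(a && b && c);
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] + c[i];
    }
}

/* Not directly vectorizable in the simple sense described here:
   pass i needs a[i - 1], which the previous pass has just produced
   (a dependence carried from one pass to the next). */
void running_sum(int *a, const int *b, size_t n)
{
    assert(a && b);
    if (n == 0) {
        return;
    }
    a[0] = b[0];
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + b[i];
    }
}
```

In the first loop, L consecutive passes could be replaced by one group of vector instructions. In the second, each pass has to wait for the one before it.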

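Vectorization Example
• Consider vectorization of a loop that operates on arrays A, B, and C.
  [Figure: source loop; not reproduced]
• Assume that the starting locations in memory for arrays A, B, and C are in registers R2, R3, and R4. Using conventional assembly-language instructions, the compiler may generate a loop that processes one element per pass.
  [Figure: conventional assembly-language loop; not reproduced]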
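The loop figures from the slide are not available in this text. As a hedged stand-in, the C sketch below assumes the loop computes A[i] = B[i] + C[i] on 4-byte elements, which is consistent with the three arrays and with the Load, Add, and Store instructions the lecture mentions, but is not confirmed by the source. The variable names echo registers R2 to R5; using `r5` as the loop counter, the temporaries `tmp_b` and `tmp_c`, and the function name are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar loop: one element is processed per pass.
   r2, r3, r4 play the role of the pointers to A, B, and C;
   r5 is the remaining element count (an assumed role).
   Returns the number of passes executed. */
int scalar_add_loop(int32_t *r2, const int32_t *r3, const int32_t *r4, int r5)
{
    int passes = 0;
    assert(r5 >= 0);
    while (r5 > 0) {
        int32_t tmp_b = *r3;   /* Load one element of B  */
        int32_t tmp_c = *r4;   /* Load one element of C  */
        tmp_b = tmp_b + tmp_c; /* Add                    */
        *r2 = tmp_b;           /* Store one element of A */
        r2++;                  /* each pointer advances by one */
        r3++;                  /* 4-byte element               */
        r4++;
        r5 -= 1;               /* one element done */
        passes++;
    }
    return passes;
}
```

For arrays of N elements this loop runs N passes, each with its own load, add, store, and pointer-update work.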

Vectorization Example (contd.)
• The Load, Add, and Store instructions at the beginning of the loop are replaced by corresponding vector instructions that operate on L elements at a time.
• The vectorized loop requires only N/L passes to process all of the data in the arrays, where N is the number of elements in each array.
• With L elements processed in each pass through the loop, the address pointers in registers R2, R3, and R4 are incremented by 4L (L elements of 4 bytes each), and the count in register R5 is decremented by L.

Vectorization Example (contd.)
• Vectorized form of the loop
  [Figure: vectorized loop; not reproduced]
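The vectorized loop itself was a figure and is not available here. The sketch below models the behaviour the lecture describes: each pass handles L elements, the three pointers advance by L elements (4L bytes for 4-byte elements), and the count drops by L. It assumes the loop computes A[i] = B[i] + C[i], that N is a multiple of L, and that `VL = 4`; the function and variable names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VL 4 /* assumed vector length L */

/* Vectorized loop modelled in C: VL elements are processed per pass.
   r2, r3, r4 point at A, B, and C; r5 is the remaining element count.
   Returns the number of passes executed. */
int vectorized_add_loop(int32_t *r2, const int32_t *r3, const int32_t *r4, int r5)
{
    int passes = 0;
    assert(r5 >= 0 && r5 % VL == 0); /* sketch assumes N is a multiple of L */
    while (r5 > 0) {
        int32_t v0[VL], v1[VL]; /* stand-ins for two vector registers */
        for (int lane = 0; lane < VL; lane++) v0[lane] = r3[lane];  /* VectorLoad  */
        for (int lane = 0; lane < VL; lane++) v1[lane] = r4[lane];  /* VectorLoad  */
        for (int lane = 0; lane < VL; lane++) v0[lane] += v1[lane]; /* VectorAdd   */
        for (int lane = 0; lane < VL; lane++) r2[lane] = v0[lane];  /* VectorStore */
        r2 += VL; /* VL elements = 4L bytes with 4-byte elements */
        r3 += VL;
        r4 += VL;
        r5 -= VL; /* L elements done in this pass */
        passes++;
    }
    return passes;
}
```

With N = 8 and L = 4 the loop runs 8/4 = 2 passes, compared with 8 passes for a loop that handles one element at a time.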
