Understanding Principal Component Analysis

Uploaded by

komal.agarwal5115

PCA

Understanding Principal Component Analysis (PCA)

Introduction: Principal Component Analysis (PCA) is a powerful statistical technique
widely used in data analysis and dimensionality reduction. It helps to identify
patterns in data, uncover underlying structures, and simplify complex datasets.

Key Concepts:

1. Dimensionality Reduction: PCA aims to reduce the number of variables in a dataset
while preserving as much of the variance in the data as possible. It accomplishes this
by transforming the original variables into a new set of variables, called principal
components, which are linear combinations of the original variables.
2. Principal Components: The directions of the components are orthogonal to each other,
and the resulting component scores are uncorrelated. The first principal component
captures the maximum variance in the data, the second principal component captures the
maximum remaining variance in a direction orthogonal to the first, and so on.
3. Variance Explained: PCA provides a way to quantify the amount of variance
explained by each principal component: the proportion explained by a component is its
eigenvalue divided by the sum of all eigenvalues. This information is crucial for
understanding how much information is retained after dimensionality reduction.
4. Eigenvalues and Eigenvectors: PCA involves calculating the eigenvalues and
eigenvectors of the covariance matrix of the centred (or standardized) data. Eigenvalues
represent the amount of variance explained by each principal component, while
eigenvectors represent the direction of the principal components.
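
The properties described in points 2 to 4 can be checked numerically. The following is a
minimal sketch using NumPy on synthetic two-variable data; the data, the random seed, and
all variable names are assumptions made for illustration only.

```python
import numpy as np

# Synthetic, correlated two-variable data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x, 0.8 * x + 0.3 * rng.normal(size=200)])

centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)

# eigh is intended for symmetric matrices and returns eigenvalues in
# ascending order, so reorder them from largest to smallest.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component scores: the data expressed in the eigenvector basis.
scores = centered @ eigenvectors
score_cov = np.cov(scores, rowvar=False)

# Proportion of variance explained by each component.
explained_ratio = eigenvalues / eigenvalues.sum()

print(np.round(eigenvectors.T @ eigenvectors, 6))  # identity: orthogonal directions
print(np.round(score_cov, 6))                      # diagonal: uncorrelated scores
print(np.round(explained_ratio, 3))
```

The off-diagonal entries of score_cov come out as zero (up to rounding), and its diagonal
entries equal the eigenvalues, which is the sense in which each eigenvalue measures the
variance captured by its component.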

Steps in PCA:

1. Standardization: Standardize the variables (mean 0, standard deviation 1) before
performing PCA whenever they are measured in different units or on different scales, so
that no variable dominates the analysis simply because of its numeric range.
2. Covariance Matrix: Calculate the covariance matrix of the standardized data.
3. Eigenvalue Decomposition: Compute the eigenvalues and eigenvectors of the
covariance matrix.
4. Selection of Principal Components: Decide on the number of principal
components to retain based on the explained variance and the application
requirements.
5. Projection: Project the standardized data onto the new coordinate system defined by
the selected principal components.
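
The five steps above can be sketched as a short NumPy function. This is an illustrative
sketch rather than a reference implementation: the function name pca, its parameters, and
the random test data are hypothetical, and using the sample standard deviation (ddof=1) is
one possible choice among several.

```python
import numpy as np

def pca(data, n_components):
    """Minimal PCA sketch following the five steps listed above."""
    data = np.asarray(data, dtype=float)

    # Step 1: standardize each column (assumes no column is constant).
    standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

    # Step 2: covariance matrix of the standardized data.
    cov = np.cov(standardized, rowvar=False)

    # Step 3: eigenvalue decomposition, sorted from largest eigenvalue down.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

    # Step 4: keep the leading components and their share of the variance.
    components = eigenvectors[:, :n_components]
    explained_ratio = eigenvalues[:n_components] / eigenvalues.sum()

    # Step 5: project the standardized data onto the kept components.
    projected = standardized @ components
    return projected, components, explained_ratio

# Hypothetical data: 50 observations of 4 variables.
rng = np.random.default_rng(0)
sample = rng.normal(size=(50, 4))
projected, components, explained_ratio = pca(sample, n_components=2)
print(projected.shape)
print(np.round(explained_ratio, 3))
```

In practice a library routine such as scikit-learn's PCA class would usually be preferred;
note that it centres the data but does not standardize it, so any scaling has to be
applied separately.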

Applications:

1. Data Compression: PCA is used to reduce the dimensionality of large datasets while
retaining most of the important information, which is beneficial for efficient storage
and processing.
2. Pattern Recognition: PCA is applied in fields such as image processing and
computer vision to identify patterns and extract features from high-dimensional data.
3. Exploratory Data Analysis: PCA helps in visualizing and exploring the underlying
structure of data, making it easier to interpret complex datasets.

Conclusion: Principal Component Analysis is a valuable tool for exploratory data
analysis, dimensionality reduction, and pattern recognition. By transforming
high-dimensional data into a lower-dimensional space, PCA enables better understanding
and visualization of complex datasets, making it an indispensable technique in
various fields of science and engineering.

Let's consider an example of using PCA in the context of a dataset containing information
about different types of fruits based on various attributes such as weight, colour, diameter,
and sweetness level.

Suppose we have a dataset with the following attributes for each fruit:

1. Weight (in grams)
2. Colour (RGB values)
3. Diameter (in centimetres)
4. Sweetness level (measured on a scale from 1 to 10)

We want to perform PCA to reduce the dimensionality of this dataset and identify the most
important factors that contribute to the variability among the fruits.

Step 1: Standardization First, we standardize the attributes so that each variable has
a mean of 0 and a standard deviation of 1. This matters here because the attributes are
measured on very different scales (grams, centimetres, RGB values, and a 1 to 10 score);
without it, the attribute with the largest numeric range would dominate the components.

Step 2: Covariance Matrix Next, we calculate the covariance matrix of the standardized
data. The covariance matrix represents the relationships between the different attributes.

Step 3: Eigenvalue Decomposition We compute the eigenvalues and eigenvectors of the
covariance matrix. The eigenvalues represent the amount of variance explained by each
principal component, and the eigenvectors represent the direction of the principal
components.

Step 4: Selection of Principal Components Based on the eigenvalues, we decide on the
number of principal components to retain. We may choose to retain only the principal
components that explain a significant amount of variance in the data.

Step 5: Projection Finally, we project the standardized data onto the new coordinate
system defined by the selected principal components. This gives us a lower-dimensional
representation of the dataset.

For example, after performing PCA, we might find that the first principal component is
primarily influenced by attributes related to size (weight and diameter), while the second
principal component is influenced by attributes related to colour and sweetness level.
This reduced representation allows us to analyse and visualize the dataset more effectively,
identifying patterns and similarities among different types of fruits based on the most
important factors.

Let's create a simplified numerical example to illustrate PCA with a small dataset containing
information about three types of fruits: apples, oranges, and bananas. We'll consider two
attributes for each fruit: weight (in grams) and diameter (in centimetres).

Our dataset looks like this:

Fruit    Weight (g)   Diameter (cm)
Apple    100          5
Apple    120          6
Orange   150          7
Orange   140          6.5
Banana   90           4
Banana   110          4.5

Step 1: Standardization

We standardize the data by subtracting the mean and dividing by the standard deviation
(S.D.) for each attribute. The standardized values are unitless.

Standardized Weight = (Weight - Mean(Weight)) / S.D.(Weight)

Standardized Diameter = (Diameter - Mean(Diameter)) / S.D.(Diameter)

Let's assume the following rounded values to keep the arithmetic simple (the exact sample
statistics are a mean of about 118.3 g and a standard deviation of about 23.2 g for
weight, and 5.5 cm and about 1.18 cm for diameter):
 • Mean(Weight) = 120 g
 • S.D.(Weight) = 20 g
 • Mean(Diameter) = 5.5 cm
 • S.D.(Diameter) = 1 cm

After standardization, our dataset becomes:

Fruit    Standardized Weight   Standardized Diameter
Apple    -1.0                  -0.5
Apple     0.0                   0.5
Orange    1.5                   1.5
Orange    1.0                   1.0
Banana   -1.5                  -1.5
Banana   -0.5                  -1.0
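
A minimal sketch of this standardization step in NumPy, using the assumed means and
standard deviations stated in the text. The variable names are illustrative, and the exact
sample statistics are computed only for comparison.

```python
import numpy as np

fruits = ["Apple", "Apple", "Orange", "Orange", "Banana", "Banana"]
# Columns: weight (g), diameter (cm).
raw = np.array([
    [100, 5.0],
    [120, 6.0],
    [150, 7.0],
    [140, 6.5],
    [90, 4.0],
    [110, 4.5],
])

# Rounded values assumed in the text, not the exact sample statistics.
assumed_mean = np.array([120.0, 5.5])
assumed_sd = np.array([20.0, 1.0])

standardized = (raw - assumed_mean) / assumed_sd
for name, (w, d) in zip(fruits, standardized):
    print(f"{name:<7} {w:5.1f} {d:5.1f}")

# For comparison: exact sample mean and sample standard deviation (ddof=1).
exact_mean = raw.mean(axis=0)
exact_sd = raw.std(axis=0, ddof=1)
print(np.round(exact_mean, 2), np.round(exact_sd, 2))
```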

Step 2: Covariance Matrix

Next, we calculate the sample covariance matrix (dividing by n - 1 = 5) of the
standardized data:

Covariance   Weight   Diameter
Weight       1.34     1.30
Diameter     1.30     1.40

The diagonal entries are not exactly 1 because the means and standard deviations assumed
in Step 1 are rounded rather than exact.
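
The covariance step can be sketched with numpy.cov, starting from the raw measurements and
the assumed means and standard deviations from Step 1. Variable names are illustrative.

```python
import numpy as np

# Columns: weight (g), diameter (cm).
raw = np.array([[100, 5.0], [120, 6.0], [150, 7.0],
                [140, 6.5], [90, 4.0], [110, 4.5]])

# Standardize with the rounded values assumed in the text.
standardized = (raw - np.array([120.0, 5.5])) / np.array([20.0, 1.0])

# rowvar=False treats each column as a variable; np.cov centres the data
# itself and divides by n - 1 by default (sample covariance).
cov = np.cov(standardized, rowvar=False)
print(np.round(cov, 2))
```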
Step 3: Eigenvalue Decomposition

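We find the eigenvalues and eigenvectors of the covariance matrix
[[1.34, 1.30], [1.30, 1.40]]:

 • Eigenvalues: λ₁ ≈ 2.67, λ₂ ≈ 0.07

 • Eigenvectors: v₁ ≈ [0.70, 0.71], v₂ ≈ [-0.71, 0.70]

The sign of each eigenvector is arbitrary: multiplying v₁ or v₂ by -1 gives an equally
valid direction.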
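
A sketch of the decomposition using numpy.linalg.eigh, which is intended for symmetric
matrices such as a covariance matrix. The covariance matrix is recomputed here from the
raw measurements and the assumed means and standard deviations so that the snippet runs
on its own; variable names are illustrative.

```python
import numpy as np

raw = np.array([[100, 5.0], [120, 6.0], [150, 7.0],
                [140, 6.5], [90, 4.0], [110, 4.5]])
standardized = (raw - np.array([120.0, 5.5])) / np.array([20.0, 1.0])
cov = np.cov(standardized, rowvar=False)

# eigh returns eigenvalues in ascending order with eigenvectors as columns,
# so reorder both from the largest eigenvalue to the smallest.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print(np.round(eigenvalues, 2))
print(np.round(eigenvectors, 2))  # column i is the direction of component i
```

The signs of the printed eigenvector columns may differ from those quoted in the text,
since an eigenvector is only defined up to a factor of -1.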

Step 4: Selection of Principal Components

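Since the first eigenvalue (λ₁) is much larger than the second eigenvalue (λ₂), the first
principal component accounts for about 97% of the total variance (λ₁ / (λ₁ + λ₂)), so we
retain only the first principal component.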
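
One common way to make this choice systematic is to keep the smallest number of
components whose cumulative share of the variance reaches a chosen threshold. The sketch
below assumes that rule; the function name choose_n_components, the 0.9 threshold, and the
example eigenvalues are all hypothetical and are not taken from the fruit dataset.

```python
import numpy as np

def choose_n_components(eigenvalues, threshold=0.95):
    """Smallest k whose leading k eigenvalues reach `threshold` of the total.

    `threshold` is expected to lie in (0, 1].
    """
    values = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(values) / values.sum()
    reached = np.nonzero(cumulative >= threshold)[0]
    # Fall back to all components if rounding keeps the total just below 1.
    return int(reached[0]) + 1 if reached.size else len(values)

# Hypothetical eigenvalue spectrum for a five-variable dataset.
example = [4.0, 2.0, 1.0, 0.5, 0.5]
print(choose_n_components(example, threshold=0.9))
```

The threshold is application-dependent; a visual check of the eigenvalues (a scree plot)
is a frequently used alternative.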

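Step 5: Projection

We project the standardized data onto the first principal component by taking the dot
product of each standardized row with v₁ ≈ [0.70, 0.71]:

Projected Value = 0.70 × Standardized Weight + 0.71 × Standardized Diameter

For example, the second orange has standardized values (1.0, 1.0), so its projected value
is 0.70 × 1.0 + 0.71 × 1.0 = 1.41.

Projected Data (1D):

Fruit    Projected Value
Apple    -1.06
Apple     0.36
Orange    2.12
Orange    1.41
Banana   -2.12
Banana   -1.06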

These projected values represent the fruits' positions along the first principal component axis,
effectively reducing the dataset's dimensionality while preserving the most important
information.
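
The projection step can be sketched as a single matrix-vector product. The snippet
recomputes the earlier steps from the raw measurements and the assumed means and standard
deviations so that it runs on its own; the variable names and the sign convention (v1
oriented so that its entries sum to a positive number) are choices made here for
illustration.

```python
import numpy as np

fruits = ["Apple", "Apple", "Orange", "Orange", "Banana", "Banana"]
raw = np.array([[100, 5.0], [120, 6.0], [150, 7.0],
                [140, 6.5], [90, 4.0], [110, 4.5]])
standardized = (raw - np.array([120.0, 5.5])) / np.array([20.0, 1.0])

cov = np.cov(standardized, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# First principal component: eigenvector of the largest eigenvalue.
v1 = eigenvectors[:, np.argmax(eigenvalues)]
if v1.sum() < 0:  # fix the arbitrary sign so larger fruits get larger scores
    v1 = -v1

# Each projected value is the dot product of a standardized row with v1.
projected = standardized @ v1
for name, value in zip(fruits, projected):
    print(f"{name:<7} {value:6.2f}")

# The variance of the projected values equals the largest eigenvalue.
print(round(float(np.var(projected, ddof=1)), 2), round(float(eigenvalues.max()), 2))
```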
