Principal Component Analysis (PCA) Explained - Built in

Principal Component Analysis (PCA) is a dimensionality reduction technique used to simplify large data sets while preserving significant patterns. The process involves standardizing variables, computing a covariance matrix, and identifying principal components through eigenvectors and eigenvalues. Ultimately, PCA helps in reducing the number of variables for easier data analysis and visualization without losing much information.

3/23/24, 6:56 PM Principal Component Analysis (PCA) Explained | Built In


A Step-by-Step Explanation of Principal Component Analysis (PCA)
Learn how to use a PCA when working with large data sets.

Written by Zakaria Jaadi


Updated by Brennan Whitfield | Feb 23, 2024

Principal component analysis (PCA) is a widely covered machine learning
method on the web. And while there are some great articles about it, many go
into too much detail. Below we cover how principal component analysis works in a
simple step-by-step way, so everyone can understand it and make use of it, even
those without a strong mathematical background.

WHAT IS PRINCIPAL COMPONENT ANALYSIS?

Principal component analysis (PCA) is a dimensionality reduction and machine
learning method used to simplify a large data set into a smaller set while still
maintaining significant patterns and trends.

Principal component analysis can be broken down into five steps. I’ll go through each
step, providing logical explanations of what PCA is doing and simplifying
mathematical concepts such as standardization, covariance, eigenvectors and
eigenvalues without focusing on how to compute them.

HOW DO YOU DO A PRINCIPAL COMPONENT ANALYSIS?

1. Standardize the range of continuous initial variables
2. Compute the covariance matrix to identify correlations
3. Compute the eigenvectors and eigenvalues of the covariance matrix to
identify the principal components
4. Create a feature vector to decide which principal components to keep
5. Recast the data along the principal components axes

First, some basic (and brief) background is necessary for context.

What Is Principal Component Analysis?

Principal component analysis, or PCA, is a dimensionality reduction method that is
often used to reduce the dimensionality of large data sets, by transforming a large set
of variables into a smaller one that still contains most of the information in the large
set.

Reducing the number of variables of a data set naturally comes at the expense of
accuracy, but the trick in dimensionality reduction is to trade a little accuracy for
simplicity. Smaller data sets are easier to explore and visualize, and machine learning
algorithms can analyze them faster and more easily without extraneous variables to
process.

So, to sum up, the idea of PCA is simple: reduce the number of variables of a
data set, while preserving as much information as possible.

What Are Principal Components?

Principal components are new variables that are constructed as linear combinations or
mixtures of the initial variables. These combinations are done in such a way that the
new variables (i.e., principal components) are uncorrelated and most of the
information within the initial variables is squeezed or compressed into the first
components. So, the idea is that 10-dimensional data gives you 10 principal components,
but PCA tries to put the maximum possible information in the first component, then
the maximum remaining information in the second and so on, until you have something
like what is shown in the scree plot below.

Scree plot: percentage of variance (information) accounted for by each PC.

Organizing information in principal components this way allows you to reduce
dimensionality without losing much information: you discard the components with
low information and treat the remaining components as your new variables.

An important thing to realize here is that the principal components are less
interpretable than the initial variables and don’t have a direct real-world meaning,
since they are constructed as linear combinations of the initial variables.

Geometrically speaking, principal components represent the directions of the data that
explain a maximal amount of variance, that is to say, the lines that capture most of
the information in the data. The relationship between variance and information here is
that the larger the variance carried by a line, the larger the dispersion of the data
points along it, and the larger the dispersion along a line, the more information it has.
To put all this simply, just think of principal components as new axes that provide the
best angle to see and evaluate the data, so that the differences between the
observations are more visible.


How PCA Constructs the Principal Components

As there are as many principal components as there are variables in the data, principal
components are constructed in such a manner that the first principal component
accounts for the largest possible variance in the data set. For example, let’s assume
that the scatter plot of our data set is as shown below. Can we guess the first principal
component? Yes, it’s approximately the line that matches the purple marks because it
goes through the origin and it’s the line in which the projection of the points (red dots)
is the most spread out. Or mathematically speaking, it’s the line that maximizes the
variance (the average of the squared distances from the projected points (red dots) to
the origin).
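To make the "most spread out projection" idea concrete, here is a minimal sketch that tries every whole-degree direction through the origin and keeps the one with the largest projected variance. The data values and the helper name `projected_variance` are assumptions made for illustration; they are not taken from the article's scatter plot.

```python
import math

# Hypothetical 2-D data set (illustrative values only).
x = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
y = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]

# Center the data so that every candidate line passes through the origin.
mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
xc = [v - mean_x for v in x]
yc = [v - mean_y for v in y]


def projected_variance(angle_deg):
    """Average squared distance from the origin of the points projected
    onto the line through the origin at the given angle (in degrees)."""
    angle = math.radians(angle_deg)
    ux, uy = math.cos(angle), math.sin(angle)  # unit vector along the line
    projections = [px * ux + py * uy for px, py in zip(xc, yc)]
    return sum(p * p for p in projections) / len(projections)


# Brute-force search over directions 0..179 degrees.
best_angle = max(range(180), key=projected_variance)
worst_angle = min(range(180), key=projected_variance)

print("direction with the most spread: ", best_angle, "degrees")
print("direction with the least spread:", worst_angle, "degrees")
```

The brute-force search is only there to illustrate the definition. In practice the direction is obtained from the eigenvectors of the covariance matrix, as described in Step 3.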

The second principal component is calculated in the same way, with the condition that
it is uncorrelated with (i.e., perpendicular to) the first principal component and that it
accounts for the next highest variance.

This continues until a total of p principal components have been calculated, equal to
the original number of variables.
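As a quick check of these two properties, the sketch below builds a small synthetic data set (an assumption for illustration), obtains the component directions with NumPy's `np.linalg.eigh` (the eigen-decomposition explained in Step 3), and verifies that the directions are mutually perpendicular and that the data expressed along them is uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations of 3 variables, two of them correlated.
base = rng.normal(size=(200, 1))
X = np.hstack([
    base + 0.1 * rng.normal(size=(200, 1)),
    2 * base + 0.5 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

Xc = X - X.mean(axis=0)                  # center each variable
cov = np.cov(Xc, rowvar=False)           # 3 x 3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # re-rank from highest to lowest
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# p variables give p components (here p = 3), one per column of eigvecs.
gram = eigvecs.T @ eigvecs               # identity matrix if perpendicular
scores = Xc @ eigvecs                    # data expressed along the components
score_cov = np.cov(scores, rowvar=False) # diagonal matrix if uncorrelated

print(np.round(gram, 6))
print(np.round(score_cov, 6))
```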

Step-by-Step Explanation of PCA

STEP 1: STANDARDIZATION

The aim of this step is to standardize the range of the continuous initial variables so
that each one of them contributes equally to the analysis.

More specifically, the reason why it is critical to perform standardization prior to PCA
is that the latter is quite sensitive regarding the variances of the initial variables. That
is, if there are large differences between the ranges of initial variables, those variables
with larger ranges will dominate over those with small ranges (for example, a variable
that ranges between 0 and 100 will dominate over a variable that ranges between 0
and 1), which will lead to biased results. So, transforming the data to comparable
scales can prevent this problem.

Mathematically, this can be done by subtracting the mean and dividing by the
standard deviation for each value of each variable:

z = (value - mean) / standard deviation

Once the standardization is done, all the variables will be transformed to the same
scale.
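A minimal sketch of this step with NumPy follows. The function name `standardize`, the `ddof` parameter and the sample values are assumptions for illustration. The article does not say whether the population or the sample standard deviation is meant, so the choice is exposed as a parameter.

```python
import numpy as np


def standardize(X, ddof=0):
    """Subtract each column's mean and divide by its standard deviation.

    Rows are observations and columns are variables. ddof=0 uses the
    population standard deviation, ddof=1 the sample one. A constant
    column (standard deviation 0) is not handled in this sketch.
    """
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=ddof)


# Hypothetical data: one variable ranges over 0-100, the other over 0-1.
X = np.array([[10.0, 0.2],
              [50.0, 0.4],
              [90.0, 0.9],
              [30.0, 0.5]])

Z = standardize(X)
print(Z.mean(axis=0))  # each column mean is (numerically) zero
print(Z.std(axis=0))   # each column standard deviation is one
```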

STEP 2: COVARIANCE MATRIX COMPUTATION

The aim of this step is to understand how the variables of the input data set are varying
from the mean with respect to each other, or in other words, to see if there is any
relationship between them. Sometimes variables are highly correlated in such a way
that they contain redundant information, so in order to identify these correlations,
we compute the covariance matrix.

The covariance matrix is a p × p symmetric matrix (where p is the number of
dimensions) that has as entries the covariances associated with all possible pairs of the
initial variables. For example, for a 3-dimensional data set with 3 variables x, y, and z,
the covariance matrix is a 3×3 matrix of this form:

Cov(x,x)  Cov(x,y)  Cov(x,z)
Cov(y,x)  Cov(y,y)  Cov(y,z)
Cov(z,x)  Cov(z,y)  Cov(z,z)

Covariance Matrix for 3-Dimensional Data.

Since the covariance of a variable with itself is its variance (Cov(a,a)=Var(a)), in the
main diagonal (top left to bottom right) we actually have the variances of each initial
variable. And since the covariance is commutative (Cov(a,b)=Cov(b,a)), the entries of
the covariance matrix are symmetric with respect to the main diagonal, which means
that the upper and the lower triangular portions are equal.
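The sketch below computes a covariance matrix for three variables in two ways, by hand and with `np.cov`, and checks the two properties just described. The random data is an assumption for illustration, and the sample convention (dividing by n - 1) is one common choice rather than something the article specifies.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 50 observations of three variables x, y and z.
X = rng.normal(size=(50, 3))
X[:, 1] += 0.8 * X[:, 0]          # make y correlated with x

# Standardize (Step 1), here with the sample standard deviation.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
n = Z.shape[0]

cov_manual = Z.T @ Z / (n - 1)    # valid because every column has mean zero
cov_numpy = np.cov(Z, rowvar=False)

print(np.round(cov_numpy, 3))
print("symmetric:", np.allclose(cov_numpy, cov_numpy.T))
print("diagonal holds the variances:",
      np.allclose(np.diag(cov_numpy), Z.var(axis=0, ddof=1)))
```

Because the variables were standardized first, every diagonal entry equals 1 and the off-diagonal entries are the correlation coefficients.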

What do the covariances that we have as entries of the matrix tell us about
the correlations between the variables?

It’s actually the sign of the covariance that matters:

If positive, then the two variables increase or decrease together (correlated).

If negative, then one increases when the other decreases (inversely correlated).
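A small example of both signs, using made-up values (the variable names and numbers are assumptions for illustration only):

```python
def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


# Hypothetical observations for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 65, 70, 79]   # rises as hours_studied rises
hours_gaming = [9, 7, 6, 4, 2]      # falls as hours_studied rises

cov_pos = covariance(hours_studied, exam_score)
cov_neg = covariance(hours_studied, hours_gaming)

print(cov_pos)  # positive: the two variables move together
print(cov_neg)  # negative: one goes up when the other goes down
```

The magnitude of a raw covariance depends on the units of the variables, which is one more reason to standardize first: on standardized data, the covariance of two variables equals their correlation coefficient.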

Now that we know that the covariance matrix is nothing more than a table that summarizes
the correlations between all the possible pairs of variables, let’s move to the next step.

STEP 3: COMPUTE THE EIGENVECTORS AND EIGENVALUES OF
THE COVARIANCE MATRIX TO IDENTIFY THE PRINCIPAL
COMPONENTS

Eigenvectors and eigenvalues are the linear algebra concepts that we need to compute
from the covariance matrix in order to determine the principal components of the
data.

What you first need to know about eigenvectors and eigenvalues is that they always
come in pairs, so that every eigenvector has an eigenvalue. Also, their number is equal
to the number of dimensions of the data. For example, for a 3-dimensional data set,
there are 3 variables, therefore there are 3 eigenvectors with 3 corresponding
eigenvalues.

It is the eigenvectors and eigenvalues that are behind all the magic of principal
components, because the eigenvectors of the covariance matrix are actually the
directions of the axes where there is the most variance (most information), and these
are what we call principal components. And eigenvalues are simply the coefficients
attached to eigenvectors, which give the amount of variance carried in each principal
component.

By ranking your eigenvectors in order of their eigenvalues, highest to lowest, you get
the principal components in order of significance.
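In NumPy, this step could look roughly like the sketch below. The data values are an assumption for illustration, and `np.linalg.eigh` is used because the covariance matrix is symmetric. It returns eigenvalues in ascending order, so they are re-ranked explicitly.

```python
import numpy as np

# Hypothetical 2-D data set (illustrative values only).
x = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
y = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]
data = np.column_stack([x, y])

# Steps 1 and 2: standardize, then compute the covariance matrix.
Z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
cov = np.cov(Z, rowvar=False)

# Step 3: eigen-decomposition of the symmetric covariance matrix.
eigvals, eigvecs = np.linalg.eigh(cov)

# Rank the pairs from highest to lowest eigenvalue.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]      # column i is the eigenvector of eigvals[i]

for i, value in enumerate(eigvals, start=1):
    print(f"PC{i}: eigenvalue {value:.4f}, direction {eigvecs[:, i - 1]}")
```

Each pair satisfies `cov @ v == eigenvalue * v` up to rounding. The sign of an eigenvector is arbitrary, so a library may return a direction flipped relative to a textbook figure.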

Principal Component Analysis Example:

Let’s suppose that our data set is 2-dimensional with 2 variables x,y and that the
eigenvectors and eigenvalues of the covariance matrix are as follows:

If we rank the eigenvalues in descending order, we get λ1>λ2, which means that the
eigenvector that corresponds to the first principal component (PC1) is v1 and the one
that corresponds to the second principal component (PC2) is v2.

After having the principal components, to compute the percentage of variance
(information) accounted for by each component, we divide the eigenvalue of each
component by the sum of eigenvalues. If we apply this on the example above, we find
that PC1 and PC2 carry respectively 96 percent and 4 percent of the variance of the
data.
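The division described above can be sketched in a few lines. The eigenvalues used here are made up for illustration and are not the ones from the article's example. The helper name `explained_variance_ratio` is likewise an assumption.

```python
from itertools import accumulate


def explained_variance_ratio(eigenvalues):
    """Share of the total variance carried by each component,
    ranked from the most to the least significant."""
    ranked = sorted(eigenvalues, reverse=True)
    total = sum(ranked)
    return [value / total for value in ranked]


# Hypothetical eigenvalues of a 3 x 3 covariance matrix.
eigenvalues = [0.5, 2.4, 0.1]

ratios = explained_variance_ratio(eigenvalues)
cumulative = list(accumulate(ratios))

for i, (ratio, running) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i}: {ratio:.1%} of the variance, {running:.1%} cumulative")
```

Plotting `ratios` as a bar chart gives a scree plot like the one described earlier.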

STEP 4: CREATE A FEATURE VECTOR

As we saw in the previous step, computing the eigenvectors and ordering them by their
eigenvalues in descending order allows us to find the principal components in order of
significance. In this step, we choose whether to keep all these components or discard
those of lesser significance (of low eigenvalues), and form with the remaining ones a
matrix of vectors that we call the feature vector.

So, the feature vector is simply a matrix that has as columns the eigenvectors of the
components that we decide to keep. This makes it the first step towards dimensionality
reduction, because if we choose to keep only k eigenvectors (components) out of p, the
final data set will have only k dimensions.

Principal Component Analysis Example:

Continuing with the example from the previous step, we can either form a feature
vector with both of the eigenvectors v1 and v2:

Or discard the eigenvector v2, which is the one of lesser significance, and form a
feature vector with v1 only:

Discarding the eigenvector v2 will reduce dimensionality by 1, and will consequently
cause a loss of information in the final data set. But given that v2 was carrying only 4
percent of the information, the loss is not significant, and we will still have the
96 percent of the information that is carried by v1.

So, as we saw in the example, it’s up to you to choose whether to keep all the
components or discard the ones of lesser significance, depending on what you are
looking for. If you just want to describe your data in terms of new variables
(principal components) that are uncorrelated, without seeking to reduce
dimensionality, leaving out the less significant components is not needed.
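One way to automate this choice is to keep the smallest number of components whose cumulative share of variance reaches a threshold. This rule, the function name `select_components` and the 95 percent default are assumptions for illustration; the article leaves the decision to the reader.

```python
import numpy as np


def select_components(eigvals, eigvecs, min_variance=0.95):
    """Return the feature vector: a matrix whose columns are the
    top-ranked eigenvectors needed to reach min_variance."""
    order = np.argsort(eigvals)[::-1]              # highest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Index of the first component at which the threshold is reached.
    k = int(np.searchsorted(cumulative, min_variance)) + 1
    return eigvecs[:, :k]


# Hypothetical 2-D data set (illustrative values only).
x = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
y = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]
data = np.column_stack([x, y])
Z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))

feature_vector = select_components(eigvals, eigvecs)         # keeps one column
full_basis = select_components(eigvals, eigvecs, 0.99)       # keeps both

print(feature_vector.shape, full_basis.shape)
```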

STEP 5: RECAST THE DATA ALONG THE PRINCIPAL COMPONENTS AXES

In the previous steps, apart from standardization, you do not make any changes on the
data, you just select the principal components and form the feature vector, but the
input data set remains always in terms of the original axes (i.e., in terms of the initial
variables).

In this step, which is the last one, the aim is to use the feature vector formed using the
eigenvectors of the covariance matrix, to reorient the data from the original axes to the
ones represented by the principal components (hence the name Principal Components
Analysis). This can be done by multiplying the transpose of the feature vector by the
transpose of the standardized original data set:

FinalDataSet = FeatureVector^T × StandardizedOriginalDataSet^T
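Putting the five steps together, a minimal end-to-end sketch with NumPy might look as follows. The data values and variable names are assumptions for illustration. With observations stored as rows, the product of the two transposes is equivalent to `Z @ feature_vector`, which is the form more commonly seen in code.

```python
import numpy as np

# Hypothetical 2-D data set (illustrative values only).
x = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
y = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]
data = np.column_stack([x, y])

# Step 1: standardize.
Z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# Step 2: covariance matrix.
cov = np.cov(Z, rowvar=False)

# Step 3: eigenvectors and eigenvalues, ranked highest to lowest.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 4: feature vector, here keeping only the first component.
feature_vector = eigvecs[:, :1]

# Step 5: recast the data along the principal component axis.
# (FeatureVector^T x StandardizedData^T), transposed back to rows.
final = (feature_vector.T @ Z.T).T

print(final.shape)   # 10 observations, 1 dimension instead of 2
print(final.ravel())
```

The variance of the resulting column equals the first eigenvalue, which is a convenient sanity check on an implementation.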


