
PCA and Eigenvalue Analysis in MATLAB

The document contains MATLAB code for solving linear systems, computing eigenvalues and eigenvectors, and performing Principal Component Analysis (PCA) on datasets. It includes multiple questions, each demonstrating different mathematical concepts and operations, such as diagonalization and data visualization. The results of computations, including eigenvalues, eigenvectors, and explained variance, are displayed for various datasets including synthetic data and the Iris dataset.

%ques 1

disp('ques 1')
% Coefficient matrix A
A = [2, 3, -1;4, -1, 2;-1, 2, 3]
% Right-hand side vector B
B = [5; 6; 4]
% Solve the linear system
X = A\B
% Display the solution
disp('solution of the linear system')
disp(X)

%ques2
disp('ques 2')
% Define matrix A
A = [2, 1, 1;
1, 3, 2;
1, 2, 2]
%Compute eigenvalues and eigenvectors
[V, D] = eig(A)
% Display results
disp('Eigenvalues (diagonal of D)')
disp(diag(D))
disp('Eigenvectors (columns of V)')
disp(V)

%ques 3
disp('ques 3')
% Plot eigenvectors
origin = [0, 0, 0]; % Origin for vectors
quiver3(origin(1), origin(2), origin(3), V(1,1), V(2,1), V(3,1), 'r', ...
    'LineWidth', 2); hold on;
quiver3(origin(1), origin(2), origin(3), V(1,2), V(2,2), V(3,2), 'g', ...
    'LineWidth', 2);
quiver3(origin(1), origin(2), origin(3), V(1,3), V(2,3), V(3,3), 'b', ...
    'LineWidth', 2);
% Add labels
xlabel('x-axis');
ylabel('y-axis');
zlabel('z-axis');
title('Eigenvectors of Matrix A');
grid on;
%legend('Eigenvector', 'Eigenvector', 'Eigenvector');
hold off;

%ques 4
disp('ques4')
% Define a symmetric matrix
A = [4, 1, 2;
1, 3, 1;
2, 1, 3];

% Compute eigenvalues and eigenvectors
[V, D] = eig(A);
% Check orthogonality of eigenvectors
orthogonality_check = V' * V;
disp('Eigenvectors (V):');
disp(V);
disp('Orthogonality Check (V'' * V):');
disp(orthogonality_check);
% Diagonalization
A_diag = V * D * V';
disp('Diagonalized Matrix (V * D * V''):');
disp(A_diag);

%ques 5
disp('ques 5')
% Generate a synthetic dataset
X = [2.5, 2.4;
0.5, 0.7;
2.2, 2.9;
1.9, 2.2;
3.1, 3.0];
% Center the data
X_mean = mean(X);
X_centered = X - X_mean;
% Compute covariance matrix
covariance_matrix = cov(X_centered);
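% Eigenvalue decomposition of the covariance matrix
[V, D] = eig(covariance_matrix);
% Project the centered data onto the eigenvectors. The columns follow the
% eigenvalue order returned by eig, which does not sort by descending variance.
principal_components = X_centered * V;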
% Display results
disp('Principal Components:');
disp(principal_components);

%ques 6
disp('ques 6 ')
data = readtable('[Link]');
meas = table2array(data(:, 1:end-1));
data_standardized = (meas - mean(meas)) ./ std(meas);
[coeff, score, ~, ~, explained] = pca(data_standardized);
figure;
plot(score(:, 1), score(:, 2), '^', 'MarkerFaceColor', 'b', ...
    'MarkerEdgeColor', 'k');
title('PCA: First Two Principal Components of Iris Dataset');
xlabel('Principal Component 1');
ylabel('Principal Component 2');
grid on;
disp('Explained variance by each principal component:');
disp(explained);

%ques 7

disp('ques 7 ')
data = readtable('[Link]');
meas = table2array(data(:, 1:end-1));
data_standardized = (meas - mean(meas)) ./ std(meas);
[coeff, score, ~, ~, explained] = pca(data_standardized);
figure;
plot(score(:, 1), score(:, 2), '^', 'MarkerFaceColor', 'b', ...
    'MarkerEdgeColor', 'k');
title('PCA: First Two Principal Components of cat Dataset');
xlabel('Principal Component 1');
ylabel('Principal Component 2');
grid on;
disp('Explained variance by each principal component:');
disp(explained);

ques 1

A =

2 3 -1
4 -1 2
-1 2 3

B =

5
6
4

X =

1.2857
1.1429
1.0000

solution of the linear system


1.2857
1.1429
1.0000

ques 2

A =

2 1 1
1 3 2
1 2 2

V =

0.1531 0.9018 0.4042
0.5665 -0.4153 0.7118
-0.8097 -0.1200 0.5744

D =

0.4116 0 0
0 1.4064 0
0 0 5.1819

Eigenvalues (diagonal of D)
0.4116
1.4064
5.1819

Eigenvectors (columns of V)
0.1531 0.9018 0.4042
0.5665 -0.4153 0.7118
-0.8097 -0.1200 0.5744

ques 3
ques4
Eigenvectors (V):
0.5665 0.4153 0.7118
0.1531 -0.9018 0.4042
-0.8097 0.1200 0.5744

Orthogonality Check (V' * V):


1.0000 0.0000 -0.0000
0.0000 1.0000 0.0000
-0.0000 0.0000 1.0000

Diagonalized Matrix (V * D * V'):


4.0000 1.0000 2.0000
1.0000 3.0000 1.0000
2.0000 1.0000 3.0000

ques 5
Principal Components:
0.2010 -0.4436
0.0550 2.1772
-0.3681 -0.5707
-0.0675 0.1290
0.1796 -1.2919

ques 6
Explained variance by each principal component:
72.7705
23.0305
3.6838
0.5152

ques 7
Explained variance by each principal component:

25.9492
9.8789
7.0156
6.4210
5.0806
3.1114
2.7060
2.2305
2.1577
1.9206
1.8559
1.6733
1.5534
1.5188
1.2035
1.1374
1.1028
1.0469
1.0059
0.9304
0.8821
0.8659
0.8247
0.7884
0.7530
0.7350
0.6945
0.6325
0.5896
0.5764
0.5684
0.5288
0.5103
0.5000
0.4725
0.4557
0.4468
0.4172
0.4113
0.4037
0.3795
0.3604
0.3591
0.3451
0.3366
0.3273
0.3182
0.3087
0.2958
0.2874
0.2858
0.2724
0.2618
0.2610
0.2542
0.2396
0.2342
0.2237
0.2220
0.2158
0.1992
0.1951
0.1889
0.1850
0.1785
0.1714
0.1535
0.1519
0.1403
0.1395
0.1320
0.1239
0.1162
0.1144
0.1074
0.1025
0.0976
0.0862
0.0713

Published with MATLAB® R2024b

Common questions


Using PCA for dimensionality reduction on large datasets lowers computational cost by reducing the number of features that downstream algorithms must process, which speeds up training. The leading components retain most of the variance, so much of the data's structure survives in a more compact form.
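As a rough illustration of how explained variance can guide the reduction, the sketch below picks the smallest number of components whose cumulative explained variance reaches a cutoff. The `explained` values are the Iris percentages printed under ques 6; the 95% cutoff and the names `threshold`, `cumulative`, and `k` are assumptions chosen for illustration, not part of the original script.

```matlab
% Explained variance (%) for the Iris dataset, copied from the ques 6 output
explained = [72.7705; 23.0305; 3.6838; 0.5152];

% Hypothetical cutoff: keep enough components to cover 95% of the variance
threshold = 95;

% Running total of explained variance
cumulative = cumsum(explained);

% Smallest number of components whose running total reaches the cutoff
k = find(cumulative >= threshold, 1);

fprintf('Components kept: %d (%.4f%% of variance)\n', k, cumulative(k));
```

With these numbers the first two components already pass the cutoff (72.7705 + 23.0305 = 95.8010), so `k` is 2.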

A symmetric matrix can be diagonalized using its eigenvectors and eigenvalues. For a symmetric matrix A, the decomposition is A = V * D * V', where the columns of V are the eigenvectors and D is the diagonal matrix of eigenvalues. The eigenvectors are orthonormal, which is checked as V' * V = I (the identity matrix), so V is an orthogonal matrix that preserves lengths and angles. This makes matrix powers simple to compute, since A^k = V * D^k * V' only requires raising the diagonal entries of D to the power k.
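The point about matrix powers can be checked numerically. The sketch below reuses the symmetric matrix from ques 4 and compares the cube computed through the decomposition with the direct result; the exponent 3 and the names `A_cubed_eig`, `A_cubed_direct`, and `power_error` are illustrative assumptions.

```matlab
% Symmetric matrix from ques 4
A = [4, 1, 2;
     1, 3, 1;
     2, 1, 3];

% Eigendecomposition; for a symmetric matrix the columns of V are orthonormal
[V, D] = eig(A);

% Cube of A through the decomposition: only the eigenvalues are raised to the power
A_cubed_eig = V * diag(diag(D).^3) * V';

% Direct matrix power for comparison
A_cubed_direct = A^3;

% The difference should be at rounding-error level
power_error = norm(A_cubed_eig - A_cubed_direct);
disp(power_error);
```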

The explained variance of a principal component is the percentage of the data's total variance captured by that component, so it measures how much each component contributes to representing the dataset. In the Iris dataset, the first component explains 72.7705% of the variance and the second 23.0305%, so the first two components together account for about 95.8% and a two-dimensional plot loses little information.
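The percentages reported by `pca` can also be reproduced by hand from the eigenvalues of the covariance matrix. The following is a minimal sketch on the synthetic dataset from ques 5, under the assumption that each eigenvalue is divided by the sum of all eigenvalues; the names `lambda` and `explained` are chosen here for illustration.

```matlab
% Synthetic dataset from ques 5
X = [2.5, 2.4;
     0.5, 0.7;
     2.2, 2.9;
     1.9, 2.2;
     3.1, 3.0];

% Eigenvalues of the covariance matrix are the variances along each principal direction
lambda = eig(cov(X));

% Sort from largest to smallest so the first entry belongs to the first component
lambda = sort(lambda, 'descend');

% Percentage of the total variance carried by each component
explained = 100 * lambda / sum(lambda);
disp(explained);
```

The entries of `explained` sum to 100 by construction, which matches the way the ques 6 percentages add up.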

Eigenvectors give the directions along which a matrix acts as a pure scaling: a vector along an eigenvector is stretched or compressed by the corresponding eigenvalue without being rotated. In ques 3 the three eigenvectors of A are plotted as arrows in 3D space. Because eig returns unit-length eigenvectors here, the arrows show direction only; the amount of stretching is given by the eigenvalues. When the matrix is a covariance matrix, these directions are the axes of greatest and least variance in the data.
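If the arrows should carry magnitude as well as direction, one option is to scale each eigenvector by its eigenvalue before plotting. This is a sketch of that variant, not the plot from ques 3; the names `scaled`, `lengths`, and `colors` are assumptions, and automatic arrow scaling is switched off so the drawn lengths match the data.

```matlab
% Symmetric matrix from ques 2
A = [2, 1, 1;
     1, 3, 2;
     1, 2, 2];
[V, D] = eig(A);

% Scale each unit eigenvector by its eigenvalue: column i is lambda_i * V(:, i)
scaled = V * D;

% Arrow lengths, one per column; these equal the absolute eigenvalues
lengths = vecnorm(scaled);

% Draw the scaled arrows from the origin with automatic scaling disabled
figure;
colors = {'r', 'g', 'b'};
hold on;
for i = 1:3
    quiver3(0, 0, 0, scaled(1, i), scaled(2, i), scaled(3, i), ...
        'Color', colors{i}, 'LineWidth', 2, 'AutoScale', 'off');
end
hold off;
view(3);   % hold on before plotting leaves the axes in 2D, so switch to a 3D view
xlabel('x-axis'); ylabel('y-axis'); zlabel('z-axis');
title('Eigenvectors of A scaled by their eigenvalues');
grid on;
```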

Eigenvalues and eigenvectors are obtained from the eigendecomposition of a matrix, here with [V, D] = eig(A). For matrix A = [2, 1, 1; 1, 3, 2; 1, 2, 2], the eigenvalues are 0.4116, 1.4064, and 5.1819, and the corresponding eigenvectors (the columns of V) are [0.1531, 0.5665, -0.8097], [0.9018, -0.4153, -0.1200], and [0.4042, 0.7118, 0.5744]. Each pair satisfies A * v = lambda * v. The eigenvalues also summarize the matrix: their sum equals the trace of A (7) and their product equals its determinant (3).
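These relations are easy to confirm numerically. The sketch below recomputes the decomposition for the ques 2 matrix and measures how far the results are from the defining equation and from the trace and determinant identities; the names `pair_residual`, `trace_gap`, and `det_gap` are illustrative.

```matlab
% Matrix from ques 2
A = [2, 1, 1;
     1, 3, 2;
     1, 2, 2];
[V, D] = eig(A);
lambda = diag(D);

% Defining relation A*v = lambda*v, checked for all three pairs at once
pair_residual = norm(A * V - V * D);

% The eigenvalues sum to the trace and multiply to the determinant
trace_gap = abs(sum(lambda) - trace(A));
det_gap = abs(prod(lambda) - det(A));

fprintf('residual %.2e, trace gap %.2e, det gap %.2e\n', ...
    pair_residual, trace_gap, det_gap);
```

All three quantities should come out at rounding-error level.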

In the PCA of the Iris dataset, most of the variance is captured by the first few components: the first component alone explains 72.7705%. In the cat dataset the variance is spread much more evenly: the first component explains 25.9492% and the second 9.8789%, followed by a long tail of small components. A two-component plot therefore represents the cat data far less completely than it does the Iris data.

Principal components are derived by centering the data, computing the covariance matrix, and performing an eigenvalue decomposition of it. The eigenvectors of the covariance matrix are the directions of maximal variance, and the principal component scores are the projections of the centered data onto those eigenvectors. Keeping only the directions with the largest eigenvalues reduces dimensionality while retaining the dominant patterns. Ques 5 applies these steps to the synthetic dataset X = [2.5, 2.4; 0.5, 0.7; 2.2, 2.9; 1.9, 2.2; 3.1, 3.0].
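The steps above can be sketched with one extra step that the ques 5 script does not perform: reordering the eigenvectors so that the first column is the direction of largest variance. The reordering and the names `lambda`, `order`, `scores`, and `score_var` are additions for illustration.

```matlab
% Synthetic dataset from ques 5
X = [2.5, 2.4;
     0.5, 0.7;
     2.2, 2.9;
     1.9, 2.2;
     3.1, 3.0];

% Step 1: center each column
X_centered = X - mean(X);

% Step 2: covariance matrix of the centered data
C = cov(X_centered);

% Step 3: eigendecomposition, then reorder by descending eigenvalue
[V, D] = eig(C);
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);

% Step 4: project the centered data onto the ordered eigenvectors
scores = X_centered * V;

% The variance of each score column should equal the matching eigenvalue
score_var = var(scores)';
disp([lambda, score_var]);
```

After the reordering, the first column of `scores` is the first principal component, which is the convention the `pca` function follows in ques 6 and ques 7.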

Standardizing data before applying PCA puts every feature on the same scale (zero mean, unit standard deviation), which prevents features with large numeric ranges from dominating the principal components. The resulting components then reflect correlations between features rather than differences in units or magnitude. Both the Iris and cat examples standardize the data before calling pca.
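A small experiment makes the effect visible. The data below is made up for illustration (two features on very different scales) and is not taken from the Iris or cat files; the names `Z`, `gap`, `raw_share`, and `std_share` are likewise assumptions. The sketch also shows that the covariance of standardized data is the correlation matrix of the raw data.

```matlab
% Hypothetical data: the second feature is roughly a thousand times larger
X = [1.0, 1000;
     2.0, 1100;
     3.0, 1500;
     4.0, 1300;
     5.0, 1800];

% Standardize the same way as in ques 6 and ques 7
Z = (X - mean(X)) ./ std(X);

% Covariance of the standardized data versus correlation of the raw data
gap = norm(cov(Z) - corrcoef(X));

% Share of total variance on the leading component, without and with standardization
raw_share = max(eig(cov(X))) / trace(cov(X));
std_share = max(eig(cov(Z))) / trace(cov(Z));

fprintf('gap %.2e, raw share %.6f, standardized share %.6f\n', ...
    gap, raw_share, std_share);
```

Without standardization the leading component is almost entirely the large-scale feature, so `raw_share` sits very close to 1; after standardization the share depends only on how strongly the two features are correlated.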

A diagonalization is verified by checking that V * D * V' reconstructs the original matrix. This form applies when A is symmetric, because its eigenvectors are then orthonormal and inv(V) = V'; for a general diagonalizable matrix the check is V * D * inv(V). Once verified, the decomposition simplifies operations such as computing matrix powers or inverses, since these act only on the diagonal entries of D.
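Reading the displayed matrices by eye works for a 3-by-3 example, but a numeric tolerance is more reliable. This is one possible check for the ques 4 matrix; the tolerance `tol` and the names `recon_error`, `ortho_error`, and `ok` are assumptions made for this sketch.

```matlab
% Symmetric matrix from ques 4
A = [4, 1, 2;
     1, 3, 1;
     2, 1, 3];
[V, D] = eig(A);

% Hypothetical tolerance for rounding error
tol = 1e-10;

% How far V*D*V' is from A, and how far V'*V is from the identity
recon_error = norm(V * D * V' - A);
ortho_error = norm(V' * V - eye(size(A)));

% The V*D*V' form is only valid for a symmetric A, so test that as well
ok = issymmetric(A) && recon_error < tol && ortho_error < tol;
disp(ok);
```

For a non-symmetric matrix the reconstruction would be written as `V * D / V` instead, since `V'` is then no longer the inverse of `V`.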

The solution to the linear system A * X = B, computed as X = A\B with the coefficient matrix A = [2, 3, -1; 4, -1, 2; -1, 2, 3] and right-hand side vector B = [5; 6; 4], is X = [1.2857; 1.1429; 1.0000].
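The printed solution can be checked by substituting it back into the system. The sketch below computes the residual and compares the result with the exact fractions 9/7, 8/7, and 1, which round to the displayed values; the names `residual`, `X_exact`, and `gap` are chosen here for illustration.

```matlab
% System from ques 1
A = [2, 3, -1;
     4, -1, 2;
    -1, 2, 3];
B = [5; 6; 4];

% Solve with the backslash operator, as in the original script
X = A \ B;

% Residual: A*X should reproduce B up to rounding error
residual = norm(A * X - B);

% Exact solution in fractions (9/7 = 1.2857..., 8/7 = 1.1428...)
X_exact = [9/7; 8/7; 1];
gap = norm(X - X_exact);

fprintf('residual %.2e, distance from exact fractions %.2e\n', residual, gap);
```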
