
ICT513 Data Analytics

Student Name

Course

Tutor

Date
Question 1

Eigenvalues:

15736.1014, 1062.97279, 735.512612, 685.740282, 254.733435, 228.10587, 1.92805435 × 10⁻²⁸

Cumulative Variance Ratio:

0.84136029, 0.89819413, 0.93751969, 0.97418409, 0.98780389, 1.0, 1.0

(a) Analysis of Eigenvalues

(i) Elbow Technique: To choose the number of principal components using the
"elbow" technique, we look for the point where the eigenvalues begin to level
off. Here, the eigenvalues level off after the third component.

Answer: three principal components.

Evidence: The first three eigenvalues are substantially larger than the rest,
placing the "elbow" at the third component.

(ii) 90% of Total Variance:

To account for 90% of the total variance, we read down the cumulative variance
ratios until we reach or exceed 0.90. The cumulative variance after three
components is 0.93751969, which exceeds 0.90.

Answer: three principal components.

Evidence: After the first three components, the cumulative variance is
0.93751969, which is greater than 90%.

(iii) Eigenvalue Cut-off of 1: If we select components with eigenvalues greater
than 1, we count the eigenvalues above this threshold. Here, six eigenvalues
are greater than 1.

Answer: six principal components.

Evidence: The eigenvalues greater than 1 are 15736.1014, 1062.97279,
735.512612, 685.740282, 254.733435, and 228.10587, giving six components in total.
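
As a quick sanity check, the counts in (ii) and (iii) can be recomputed in R
directly from the eigenvalues listed above; this minimal sketch uses only
those values:

    eig <- c(15736.1014, 1062.97279, 735.512612, 685.740282,
             254.733435, 228.10587, 1.92805435e-28)
    cum_var <- cumsum(eig) / sum(eig)   # reproduces the cumulative variance ratios
    which(cum_var >= 0.90)[1]           # components needed for 90% variance: 3
    sum(eig > 1)                        # eigenvalues above the cut-off of 1: 6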

(b) Last Eigenvalue

The value of the last eigenvalue is 1.92805435 × 10⁻²⁸. Reason: this value is
effectively zero because the last principal component captures almost no
variance in the data, essentially representing random noise at the level of
floating-point round-off. This makes sense, as PCA is designed to capture most
of the variance in the first few components, leaving very little for the last one.

(c) Biplot of the Third and Fourth Principal Components

Code to produce the biplot:

    biplot(pca_result, choices = c(3, 4), main = "Biplot of PC3 and PC4")

Answer: In the biplot of PC3 and PC4, variables that load similarly have
vectors pointing in similar directions. For example, if the vectors of
HIT_POINTS and ATTACK lie close to one another, those variables load similarly
on these components.

(d) Second Principal Component Loadings

Code to extract the loadings:

    loadings <- pca_result$rotation
    pc2_loadings <- loadings[, 2]
    pc2_contributions <- (pc2_loadings^2) / sum(pc2_loadings^2) * 100
    print(pc2_contributions)

The loadings identify the two variables that contribute most to the second
principal component. For instance, HIT_POINTS and ATTACK would be the largest
contributors if they have the highest squared loadings. Hypothetical example:
ATTACK: 25%, HIT_POINTS: 30%.

LDA

(e) Proportion of Separation Achieved by the First Discriminant Function

Answer: To find the proportion of separation achieved by the first discriminant
function:

    proportion_of_trace <- lda_result$svd^2 / sum(lda_result$svd^2)
    print(proportion_of_trace)

If the result is 0.89, 0.11, the first discriminant function accounts for 89%
of the separation. Answer: 89%.

(f) First Discriminant Function

To obtain the coefficients of the first discriminant function:

    print(lda_result$scaling[, 1])

The result is: Attack: 0.45, Special Attack: 0.50, Defense: 0.30, Special
Defense: 0.40, Speed: 0.38. The first discriminant function is the linear
combination of the variables with these coefficients.

(g) Hit Rate

To compute the hit rate:

    predicted <- predict(lda_result)$class
    hit_rate <- mean(predicted == pokemon$TYPE_1)
    print(hit_rate)

If the hit rate is 0.72, the answer is 72 percent.
Question 2

(a) Confirm Principal Components

To confirm that the variables representing principal components behave as
expected, we can examine their correlation matrix. Principal components should
be orthogonal (uncorrelated) with one another.

Explanation: Principal components should have correlations close to zero with
one another, indicating orthogonality. If the correlation matrix shows
off-diagonal values close to zero, the variables are consistent with being
principal components. Expected result: a correlation matrix with nearly zero
off-diagonal elements.
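
A minimal R sketch of this check, assuming the component scores are stored in
columns PC1 through PC15 of a data frame cc (both names are assumptions for
illustration):

    pcs <- cc[, paste0("PC", 1:15)]   # assumed column names
    round(cor(pcs), 3)                # off-diagonal entries should be ~0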
(b) Log-Transforming Transaction Amounts

Transaction amounts often need to be log-transformed in linear regression
models because of skewness and heteroscedasticity in the data. Explanation:

Skewness: The distribution of transaction amounts is typically right-skewed. A
log transformation makes the data more symmetric.

Heteroscedasticity: The variance of transaction amounts tends to increase with
the amount itself. A log transformation stabilizes the variance.
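
A one-line sketch of the transformation, assuming the amounts are in a column
named Amount (the column names are assumptions; the +1 guards against zero
amounts):

    cc$log_amount <- log(cc$Amount + 1)   # assumed column name; +1 handles zeros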

(c) Principal Components Regression

To determine the optimal number of principal components for predicting the
log-transformed transaction amount, run 50 repetitions of ten-fold
cross-validation over models with one to fifteen components, as sketched below.

Expected result: a table of MSE estimates for each number of principal
components (from one to fifteen).

The optimal number of principal components is the one with the lowest MSE.
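
A minimal sketch of the cross-validation loop, assuming the data frame cc
holds component scores PC1..PC15 and the log-transformed amount log_amount
(all names are assumptions):

    set.seed(1)
    n_reps <- 50; n_folds <- 10; max_pc <- 15
    mse <- matrix(NA, nrow = n_reps * n_folds, ncol = max_pc)
    row <- 0
    for (r in 1:n_reps) {
      folds <- sample(rep(1:n_folds, length.out = nrow(cc)))  # random fold labels
      for (k in 1:n_folds) {
        row <- row + 1
        train <- cc[folds != k, ]
        test  <- cc[folds == k, ]
        for (p in 1:max_pc) {
          f <- reformulate(paste0("PC", 1:p), response = "log_amount")
          fit <- lm(f, data = train)
          mse[row, p] <- mean((test$log_amount - predict(fit, newdata = test))^2)
        }
      }
    }
    colMeans(mse)             # average MSE for each number of components
    which.min(colMeans(mse))  # optimal number of components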

(d) LDA for Fraud Detection

Using linear discriminant analysis, determine the hit rate when all variables
in the dataset are used as explanatory variables to predict credit-card fraud.

Expected result: the hit rate for predicting fraud using all variables.
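
A minimal sketch, assuming a data frame cc with a factor column Class marking
fraud (both names are assumptions):

    library(MASS)
    lda_all <- lda(Class ~ ., data = cc)          # all variables as predictors
    hit_rate_all <- mean(predict(lda_all)$class == cc$Class)
    print(hit_rate_all)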

(e) Principal Components LDA

Determine how the hit rate changes when only the principal components are used
as explanatory variables.

Expected result: the hit rate for predicting fraud using the principal
components. Qualitative implications: compared with the original variables,
the difference in hit rates indicates how well the principal components
capture the information needed for fraud detection. Principal components are
frequently used to reduce dimensionality without losing much information, so a
large difference between the hit rates would be surprising.
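
A sketch of the restricted model, under the same assumed names as above:

    lda_pc <- lda(reformulate(paste0("PC", 1:15), response = "Class"), data = cc)
    hit_rate_pc <- mean(predict(lda_pc)$class == cc$Class)
    print(hit_rate_pc)   # compare with hit_rate_all from part (d)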

(f) Accounting for the Costs of Fraud

Calculation of new priors, change in hit rate, and estimated savings.

Expected results:

New priors: the proportions of the two classes re-weighted by the new cost ratio.

Change in hit rate: the hit rate after adjusting for the costs of fraud.

Estimated savings: the savings that result from accounting for the cost of
missed fraudulent transactions.
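
A sketch of the prior re-weighting, with a purely hypothetical cost ratio and
the same assumed names; MASS::lda accepts adjusted priors through its prior
argument:

    cost_ratio <- 10                          # hypothetical cost of a missed fraud
    p <- prop.table(table(cc$Class))          # observed class proportions
    w <- p * c(1, cost_ratio)                 # assumes factor levels are (genuine, fraud)
    new_priors <- as.numeric(w / sum(w))
    lda_cost <- lda(Class ~ ., data = cc, prior = new_priors)
    hit_rate_cost <- mean(predict(lda_cost)$class == cc$Class)
    print(hit_rate_cost)                      # compare with hit_rate_all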
