JML Regression

The document discusses linear regression and factor analysis techniques. It covers topics like loss functions, gradient descent, bias and variance, overfitting and underfitting, regularization, and the key steps in performing factor analysis including factor extraction, rotation, and determining the number of factors. Examples of different regression and machine learning algorithms are provided in terms of their bias and variance. The document also explains concepts like eigenvalues, factor loadings, and communalities which are important in factor analysis.


Regression
Linear Regression
▪ Linear Regression: the loss function is the squared error between the predictions and the targets.

▪ Gradient descent is an iterative optimization algorithm used to find the minimum of a function. At each step, the weights move against the gradient of the loss:

w = w - alpha * dLoss/dw
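As a sketch of the update rule above (with made-up 1-D data and a hypothetical learning rate), gradient descent for a simple linear model looks like:

```python
import numpy as np

# Hypothetical data roughly following y = 2x + 1
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, 100)

w, b = 0.0, 0.0      # slope and intercept, initialized at zero
alpha = 0.01         # learning rate
for _ in range(2000):
    y_pred = w * x + b
    # Gradients of the mean squared error loss with respect to w and b
    dw = -2.0 * np.mean((y - y_pred) * x)
    db = -2.0 * np.mean(y - y_pred)
    w -= alpha * dw  # w = w - alpha * gradient
    b -= alpha * db

print(w, b)  # should approach the true slope 2 and intercept 1
```

Each iteration moves the parameters a small step downhill on the loss surface; if alpha is too large the updates diverge, and if it is too small convergence is slow.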
Bias and variance
▪ High bias: the model is too simple and gives poor accuracy, even on the training data.
▪ High variance: the model scores well on the training data but poorly on the test data.

▪ Underfitting: high bias, low variance

▪ Overfitting: low bias, high variance
Fit lines
[figure: an overfit curve (low bias, high variance) next to an underfit line (high bias, low variance)]
▪ Examples of low-bias and high-variance machine
learning algorithms include: Decision Trees, k-Nearest
Neighbors and Support Vector Machines.
▪ Examples of high-bias and low-variance machine
learning algorithms include: Linear Regression, Linear
Discriminant Analysis and Logistic Regression.
▪ It is important to constrain the model while fitting, for
example by penalizing large coefficients, to minimize the
risk of overfitting; this process is called regularization.

▪ Linear: Loss = (y - yp)^2
▪ Ridge: Loss = Loss + lambda * (m)^2
  (an L2 penalty: Loss = Loss + lambda * sum of squared coefficients)
▪ Lasso: Loss = Loss + lambda * |m|
  (an L1 penalty: Loss = Loss + lambda * sum of absolute coefficients)
Ridge Regression

It shrinks the parameters, and is therefore often used to
mitigate multicollinearity.
It reduces model complexity through coefficient shrinkage.
It uses the L2 regularization technique.
Lasso regression

▪ LASSO (Least Absolute Shrinkage and Selection Operator)

It uses the L1 regularization technique.
It is generally used when we have a large number of
features, because it automatically performs feature
selection (it drives some coefficients exactly to zero).
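A minimal numpy sketch (on synthetic, deliberately collinear data) illustrating how the ridge penalty shrinks coefficients relative to ordinary least squares; the closed-form solution used here is standard, but the data and lambda value are made up:

```python
import numpy as np

# Hypothetical data with two strongly correlated features (multicollinearity)
rng = np.random.default_rng(1)
x1 = rng.normal(0, 1, 200)
x2 = x1 + rng.normal(0, 0.01, 200)   # nearly identical to x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(0, 0.1, 200)

def ridge(X, y, lam):
    # Closed-form ridge solution: (X'X + lam*I)^-1 X'y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

ols_coef = ridge(X, y, 0.0)      # ordinary least squares (lambda = 0)
ridge_coef = ridge(X, y, 10.0)   # shrunken coefficients

# The ridge coefficient vector has a smaller norm than the OLS one
print(np.linalg.norm(ols_coef), np.linalg.norm(ridge_coef))
```

With collinear features the OLS coefficients are unstable (large values of opposite sign can cancel out), while the L2 penalty keeps them small and spreads the weight across the correlated columns.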
Understanding Factor Analysis

 Factor analysis is commonly used in:


⚪ Data reduction
⚪ Scale development
⚪ The evaluation of the psychometric quality of a measure,
and
⚪ The assessment of the dimensionality of a set of variables.
▪ Factor analysis is a linear statistical model. It is used to
explain the variance among the observed variables and to
condense a set of observed variables into unobserved
variables called factors. Observed variables are modeled as
linear combinations of factors plus error terms.
▪ How does factor analysis work?
▪ The primary objective of factor analysis is to reduce the
number of observed variables and find unobservable
variables. 
▪ Factor Extraction: In this step, the number of factors and
the extraction approach are selected using variance
partitioning methods.
▪ Factor Rotation: There are lots of rotation methods that are
available such as: Varimax rotation method, Quartimax
rotation method, and Promax rotation method.
▪ What is a factor?
▪ A factor is a latent variable which describes the association
among a number of observed variables. The maximum
number of factors equals the number of observed
variables. Every factor explains a certain amount of the
variance in the observed variables.
▪ What are the factor loadings?
▪ The factor loading is a matrix which shows the relationship of
each variable to the underlying factor.
▪ What are eigenvalues?
▪ Eigenvalues represent the variance explained by each factor
out of the total variance. They are also known as
characteristic roots.
▪ What are communalities?
▪ Communalities are the sum of the squared loadings for each
variable. They represent the common variance.
Steps in Factor
Analysis

■ Factor analysis usually proceeds in four steps:


■ 1st Step: the correlation matrix for all variables is computed
■ 2nd Step: Factor extraction
■ 3rd Step: Factor rotation
■ 4th Step: Make final decisions about the number of underlying
factors
Steps in Factor Analysis: The
Correlation Matrix
 1st Step: the correlation matrix
■ Generate a correlation matrix for all variables
■ Identify variables not related to other variables
Steps in Factor Analysis: The
Correlation Matrix
⚪ Bartlett Test of Sphericity:
⚪ used to test the hypothesis that the correlation matrix is an identity matrix
(all diagonal terms are 1 and all off-diagonal terms are 0).

⚪ If the value of the test statistic for sphericity is large and the
associated significance level is small, it is unlikely that the
population correlation matrix is an identity.

⚪ If the hypothesis that the population correlation matrix is an identity


cannot be rejected because the observed significance level is large,
the use of the factor model should be reconsidered.
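As an illustrative sketch (the correlation matrix below is made up), Bartlett's statistic is computed from the determinant of the correlation matrix; under the identity hypothesis it follows a chi-square distribution with p(p-1)/2 degrees of freedom:

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's test statistic for H0: the correlation matrix is an identity.
    R: p x p correlation matrix, n: sample size."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical correlation matrix for 3 clearly correlated variables
R = np.array([[1.0, 0.60, 0.50],
              [0.60, 1.0, 0.55],
              [0.50, 0.55, 1.0]])
chi2, df = bartlett_sphericity(R, n=200)
print(chi2, df)  # a large statistic relative to df: reject the identity hypothesis
```

For an identity matrix the determinant is 1, its log is 0, and the statistic vanishes; strong correlations push the determinant toward 0 and the statistic up.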
Steps in Factor Analysis: The
Correlation Matrix
⚪ The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy:
⚪ is an index for comparing the magnitude of the observed
correlation coefficients to the magnitude of the partial correlation
coefficients.
⚪ The closer the KMO measure is to 1, the greater the sampling adequacy
(.8 and higher is great, .7 is acceptable, .6 is mediocre, less than .5 is
unacceptable).

⚪ Reasonably large values are needed for a good factor analysis. Small KMO
values indicate that a factor analysis of the variables may not be a good
idea.
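A small numpy sketch of the KMO computation (the correlation matrix is made up): partial correlations are obtained from the inverse of the correlation matrix, and KMO compares the squared observed correlations against the squared partial correlations:

```python
import numpy as np

def kmo(R):
    """Kaiser-Meyer-Olkin measure of sampling adequacy from a correlation matrix."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)   # partial correlation matrix
    np.fill_diagonal(partial, 0.0)
    r = R - np.eye(len(R))            # off-diagonal correlations only
    return (r ** 2).sum() / ((r ** 2).sum() + (partial ** 2).sum())

# Hypothetical correlation matrix with one broad common factor
R = np.array([[1.0, 0.60, 0.50],
              [0.60, 1.0, 0.55],
              [0.50, 0.55, 1.0]])
print(kmo(R))
```

When variables share broad common variance, partial correlations are small relative to the raw correlations and KMO approaches 1; when correlations are mostly pairwise quirks, the partials stay large and KMO drops.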
Steps in Factor Analysis:
Factor Extraction
 2nd Step: Factor extraction
 The primary objective of this stage is to determine the factors.
 Initial decisions can be made here about the number of factors
underlying a set of measured variables.
 Estimates of initial factors are obtained using Principal components
analysis.
 The principal components analysis is the most commonly used
extraction method . Other factor extraction methods include:
 Maximum likelihood method
 Principal axis factoring
 Alpha method
 Unweighted least squares method
 Generalized least square method
 Image factoring.
Steps in Factor Analysis:
Factor Extraction
 In principal components analysis, linear combinations of the observed
variables are formed.

 The 1st principal component is the combination that accounts for the largest
amount of variance in the sample (1st extracted factor).

 The 2nd principal component accounts for the next largest amount of variance
and is uncorrelated with the first (2nd extracted factor).

 Successive components explain progressively smaller portions of the total


sample variance, and all are uncorrelated with each other.
Steps in Factor Analysis: Factor
Extraction
 To decide on how many factors we need to represent the data, we
use 2 statistical criteria:

⚪ Eigenvalues, and

⚪ The Scree Plot.

 The determination of the number of factors is usually done by
considering only factors with eigenvalues greater than 1.

 Factors with a variance less than 1 are no better than a single variable,
since each variable is expected to have a variance of 1.

Total Variance Explained

                  Initial Eigenvalues                Extraction Sums of Squared Loadings
Component    Total   % of Variance   Cumulative %    Total   % of Variance   Cumulative %
    1        3.046      30.465          30.465       3.046      30.465          30.465
    2        1.801      18.011          48.476       1.801      18.011          48.476
    3        1.009      10.091          58.566       1.009      10.091          58.566
    4         .934       9.336          67.902
    5         .840       8.404          76.307
    6         .711       7.107          83.414
    7         .574       5.737          89.151
    8         .440       4.396          93.547
    9         .337       3.368          96.915
   10         .308       3.085         100.000

Extraction Method: Principal Component Analysis.


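The "eigenvalue greater than 1" rule can be sketched in numpy on made-up data, where 6 observed variables are generated from 2 latent factors:

```python
import numpy as np

# Hypothetical data: 6 observed variables driven by 2 latent factors
rng = np.random.default_rng(2)
n = 500
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack([
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
])
R = np.corrcoef(X, rowvar=False)            # correlation matrix of the variables
eigvals = np.linalg.eigvalsh(R)[::-1]       # eigenvalues, largest first
n_factors = int((eigvals > 1.0).sum())      # Kaiser criterion: keep eigenvalues > 1
print(n_factors)                            # recovers the 2 underlying factors
```

The two large eigenvalues absorb the shared variance of each block of three variables, while the remaining eigenvalues fall well below 1.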
Steps in Factor Analysis: Factor
Extraction

 The examination of the scree plot provides a
visual of the total variance associated with
each factor.

 The steep slope shows the large factors.

 The gradual trailing off (the "scree") shows the rest
of the factors, usually below an eigenvalue of 1.

 In choosing the number of factors, in addition
to the statistical criteria, one should make
initial decisions based on conceptual and
theoretical grounds.

 At this stage, the decision about the number of
factors is not final.
Steps in Factor Analysis: Factor
Extraction
Component Matrix using Principal Component Analysis
Component Matrixa

Component

1 2 3

I discussed my frustrations and feelings with person(s) in school .771 -.271 .121

I tried to develop a step-by-step plan of action to remedy the problems .545 .530 .264

I expressed my emotions to my family and close friends .580 -.311 .265

I read, attended workshops, or sought some other educational approach to correct the .398 .356 -.374
problem

I tried to be emotionally honest with myself about the problems .436 .441 -.368

I sought advice from others on how I should solve the problems .705 -.362 .117

I explored the emotions caused by the problems .594 .184 -.537

I took direct action to try to correct the problems .074 .640 .443

I told someone I could trust about how I felt about the problems .752 -.351 .081

I put aside other activities so that I could work to solve the problems .225 .576 .272

Extraction Method: Principal Component Analysis.

a. 3 components extracted.
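Using the first item's loadings from the component matrix above, its communality (the sum of squared loadings, defined earlier) can be checked directly:

```python
import numpy as np

# Loadings of the first item on the 3 extracted components (from the table above)
loadings = np.array([0.771, -0.271, 0.121])
communality = float((loadings ** 2).sum())  # variance this item shares with the factors
print(round(communality, 3))  # 0.683
```

So about 68% of this item's variance is accounted for by the three extracted components; the remaining 32% is unique variance and error.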
Steps in Factor Analysis:
Factor Rotation
 3rd Step: Factor rotation.
 In this step, factors are rotated.

 Un-rotated factors are typically not very interpretable (most factors are
correlated with many variables).

 Factors are rotated to make them more meaningful and easier to interpret
(each variable is associated with a minimal number of factors).

 Different rotation methods may result in the identification of


somewhat different factors.
Steps in Factor Analysis: Factor Rotation
 The most popular rotational method is Varimax rotation.

 Varimax uses orthogonal rotations, yielding uncorrelated
factors/components.

 Varimax attempts to minimize the number of variables that have high


loadings on a factor. This enhances the interpretability of the factors.
Steps in Factor Analysis: Factor
Rotation
 Other common rotational method used include Oblique rotations
which yield correlated factors.

 Oblique rotations are less frequently used because their results are
more difficult to summarize.

 Other rotational methods include:


 Quartimax (Orthogonal)
 Equamax (Orthogonal)
 Promax (oblique)
Rotated Component Matrix

Component

1 2 3

I discussed my frustrations and feelings with person(s) in school .803 .186 .050

I tried to develop a step-by-step plan of action to remedy the problems .270 .304 .694

I expressed my emotions to my family and close friends .706 -.036 .059

I read, attended workshops, or sought some other educational approach to .050 .633 .145
correct the problem

I tried to be emotionally honest with myself about the problems .042 .685 .222

I sought advice from others on how I should solve the problems .792 .117 -.038

I explored the emotions caused by the problems .248 .782 -.037

I took direct action to try to correct the problems -.120 -.023 .772

I told someone I could trust about how I felt about the problems .815 .172 -.040

I put aside other activities so that I could work to solve the problems -.014 .155 .657
Steps in Factor Analysis:
Making Final Decisions

4th Step: Making final decisions
⚪ The final decision about the number of factors to choose is the number of
factors for the rotated solution that is most interpretable.
⚪ To identify factors, group variables that have large loadings for the same
factor.
⚪ Plots of loadings provide a visual for variable clusters.

⚪ Interpret factors according to the meaning of the variables


This decision should be guided by:
⚪ A priori conceptual beliefs about the number of factors, from past research or
theory.

⚪ Eigenvalues computed in step 2.

⚪ The relative interpretability of rotated solutions computed in step 3.
Principal Component Analysis

▪ Principal Component Analysis, also known as PCA, is a
feature extraction method in which we create new
independent features from combinations of the old
features and keep only those that are most important in
predicting the target. New features are extracted from the
old features, and any feature considered less relevant to
the target variable can be dropped.
The best line for projection lies in the direction of largest variance.
The coordinate system is modified so that, once the data are projected onto the best line, each point is
reduced to a 1-D representation y.
Along this direction (the green line), the new data y retain the variance of the old data x.
PCA preserves the maximum variance in the data.
Doing PCA on n dimensions generates a new set of n dimensions: the first principal component captures the
maximum variance in the underlying data, and the second principal component is orthogonal to it.
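A compact numpy sketch of this idea (on synthetic 2-D data stretched along one axis): center the data, eigendecompose the covariance matrix, and project onto the direction of largest variance:

```python
import numpy as np

# Hypothetical 2-D data stretched along one direction
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

Xc = X - X.mean(axis=0)              # center the data
cov = np.cov(Xc, rowvar=False)       # covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]    # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

y = Xc @ eigvecs[:, 0]               # 1-D projection onto the best line
print(eigvals)                       # first eigenvalue dominates
```

The variance of the 1-D projection y equals the first eigenvalue, which is exactly the "maximum variance preserved" property described above.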
PCA VS FA

▪ PCA components explain the maximum amount of variance while factor


analysis explains the covariance in data.
▪ PCA components are fully orthogonal to each other whereas factor
analysis does not require factors to be orthogonal.
▪ PCA component is a linear combination of the observed variable while in
FA, the observed variables are linear combinations of the unobserved
variable or factor.
▪ PCA components are often hard to interpret. In FA, the
underlying factors are labelable and interpretable.
▪ PCA is a kind of dimensionality reduction method, whereas
factor analysis is a latent variable method.
▪ PCA is sometimes treated as a type of factor analysis, but PCA
is observational whereas FA is a modeling technique.
