JML Regression

The document discusses linear regression and factor analysis techniques. It covers topics like loss functions, gradient descent, bias and variance, overfitting and underfitting, regularization, and the key steps in performing factor analysis including factor extraction, rotation, and determining the number of factors. Examples of different regression and machine learning algorithms are provided in terms of their bias and variance. The document also explains concepts like eigenvalues, factor loadings, and communalities which are important in factor analysis.


Regression
Linear Regression
▪ Linear Regression: the loss function is the squared error between the predictions and the targets.

▪ Gradient descent is an iterative optimization algorithm used to find the minimum of a function. At each step, the weights move against the gradient of the loss:

w = w - alpha * dLoss/dw
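As a sketch of the update rule above (with made-up 1-D data and a hypothetical learning rate), gradient descent for a simple linear model looks like:

```python
import numpy as np

# Hypothetical data roughly following y = 2x + 1
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, 100)

w, b = 0.0, 0.0      # slope and intercept, initialized at zero
alpha = 0.01         # learning rate
for _ in range(2000):
    y_pred = w * x + b
    # Gradients of the mean squared error loss with respect to w and b
    dw = -2.0 * np.mean((y - y_pred) * x)
    db = -2.0 * np.mean(y - y_pred)
    w -= alpha * dw  # w = w - alpha * gradient
    b -= alpha * db

print(w, b)  # should approach the true slope 2 and intercept 1
```

Each iteration moves the parameters a small step downhill on the loss surface; if alpha is too large the updates diverge, and if it is too small convergence is slow.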
Bias and variance
▪ High bias: the model is too simple and gives poor accuracy, even on the training data.
▪ High variance: the model scores well on the training data but poorly on the test data.

▪ Underfitting: high bias, low variance

▪ Overfitting: low bias, high variance
Fit lines
[figure: an overfit curve (low bias, high variance) next to an underfit line (high bias, low variance)]
▪ Examples of low-bias and high-variance machine
learning algorithms include: Decision Trees, k-Nearest
Neighbors and Support Vector Machines.
▪ Examples of high-bias and low-variance machine
learning algorithms include: Linear Regression, Linear
Discriminant Analysis and Logistic Regression.
▪ It is important to constrain the model while fitting, for
example by penalizing large coefficients, to minimize the
risk of overfitting; this process is called regularization.

▪ Linear: Loss = (y - yp)^2
▪ Ridge: Loss = Loss + lambda * (m)^2
  (an L2 penalty: Loss = Loss + lambda * sum of squared coefficients)
▪ Lasso: Loss = Loss + lambda * |m|
  (an L1 penalty: Loss = Loss + lambda * sum of absolute coefficients)
Ridge Regression

It shrinks the parameters, and is therefore often used to
mitigate multicollinearity.
It reduces model complexity through coefficient shrinkage.
It uses the L2 regularization technique.
Lasso regression

▪ LASSO (Least Absolute Shrinkage and Selection Operator)

It uses the L1 regularization technique.
It is generally used when we have a large number of
features, because it automatically performs feature
selection (it drives some coefficients exactly to zero).
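A minimal numpy sketch (on synthetic, deliberately collinear data) illustrating how the ridge penalty shrinks coefficients relative to ordinary least squares; the closed-form solution used here is standard, but the data and lambda value are made up:

```python
import numpy as np

# Hypothetical data with two strongly correlated features (multicollinearity)
rng = np.random.default_rng(1)
x1 = rng.normal(0, 1, 200)
x2 = x1 + rng.normal(0, 0.01, 200)   # nearly identical to x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(0, 0.1, 200)

def ridge(X, y, lam):
    # Closed-form ridge solution: (X'X + lam*I)^-1 X'y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

ols_coef = ridge(X, y, 0.0)      # ordinary least squares (lambda = 0)
ridge_coef = ridge(X, y, 10.0)   # shrunken coefficients

# The ridge coefficient vector has a smaller norm than the OLS one
print(np.linalg.norm(ols_coef), np.linalg.norm(ridge_coef))
```

With collinear features the OLS coefficients are unstable (large values of opposite sign can cancel out), while the L2 penalty keeps them small and spreads the weight across the correlated columns.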
Understanding Factor Analysis

 Factor analysis is commonly used in:


⚪ Data reduction
⚪ Scale development
⚪ The evaluation of the psychometric quality of a measure,
and
⚪ The assessment of the dimensionality of a set of variables.
▪ Factor analysis is a linear statistical model. It is used to
explain the variance among the observed variables and to
condense a set of observed variables into unobserved
variables called factors. Observed variables are modeled as
linear combinations of factors plus error terms.
▪ How does factor analysis work?
▪ The primary objective of factor analysis is to reduce the
number of observed variables and find unobservable
variables. 
▪ Factor Extraction: In this step, the number of factors and
the extraction approach are selected using variance
partitioning methods.
▪ Factor Rotation: There are lots of rotation methods that are
available such as: Varimax rotation method, Quartimax
rotation method, and Promax rotation method.
▪ What is a factor?
▪ A factor is a latent variable which describes the association
among a number of observed variables. The maximum
number of factors equals the number of observed
variables. Every factor explains a certain amount of the
variance in the observed variables.
▪ What are the factor loadings?
▪ The factor loading is a matrix which shows the relationship of
each variable to the underlying factor.
▪ What are eigenvalues?
▪ Eigenvalues represent the variance explained by each factor
out of the total variance. They are also known as
characteristic roots.
▪ What are communalities?
▪ Communalities are the sum of the squared loadings for each
variable. They represent the common variance.
Steps in Factor
Analysis

■ Factor analysis usually proceeds in four steps:


■ 1st Step: the correlation matrix for all variables is computed
■ 2nd Step: Factor extraction
■ 3rd Step: Factor rotation
■ 4th Step: Make final decisions about the number of underlying
factors
Steps in Factor Analysis: The
Correlation Matrix
 1st Step: the correlation matrix
■ Generate a correlation matrix for all variables
■ Identify variables not related to other variables
Steps in Factor Analysis: The
Correlation Matrix
⚪ Bartlett Test of Sphericity:
⚪ used to test the hypothesis that the correlation matrix is an identity matrix
(all diagonal terms are 1 and all off-diagonal terms are 0).

⚪ If the value of the test statistic for sphericity is large and the
associated significance level is small, it is unlikely that the
population correlation matrix is an identity.

⚪ If the hypothesis that the population correlation matrix is an identity


cannot be rejected because the observed significance level is large,
the use of the factor model should be reconsidered.
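As an illustrative sketch (the correlation matrix below is made up), Bartlett's statistic is computed from the determinant of the correlation matrix; under the identity hypothesis it follows a chi-square distribution with p(p-1)/2 degrees of freedom:

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's test statistic for H0: the correlation matrix is an identity.
    R: p x p correlation matrix, n: sample size."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical correlation matrix for 3 clearly correlated variables
R = np.array([[1.0, 0.60, 0.50],
              [0.60, 1.0, 0.55],
              [0.50, 0.55, 1.0]])
chi2, df = bartlett_sphericity(R, n=200)
print(chi2, df)  # a large statistic relative to df: reject the identity hypothesis
```

For an identity matrix the determinant is 1, its log is 0, and the statistic vanishes; strong correlations push the determinant toward 0 and the statistic up.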
Steps in Factor Analysis: The
Correlation Matrix
⚪ The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy:
⚪ is an index for comparing the magnitude of the observed
correlation coefficients to the magnitude of the partial correlation
coefficients.
⚪ The closer the KMO measure is to 1, the greater the sampling adequacy
(.8 and higher is great, .7 is acceptable, .6 is mediocre, less than .5 is
unacceptable).

⚪ Reasonably large values are needed for a good factor analysis. Small KMO
values indicate that a factor analysis of the variables may not be a good
idea.
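A small numpy sketch of the KMO computation (the correlation matrix is made up): partial correlations are obtained from the inverse of the correlation matrix, and KMO compares the squared observed correlations against the squared partial correlations:

```python
import numpy as np

def kmo(R):
    """Kaiser-Meyer-Olkin measure of sampling adequacy from a correlation matrix."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)   # partial correlation matrix
    np.fill_diagonal(partial, 0.0)
    r = R - np.eye(len(R))            # off-diagonal correlations only
    return (r ** 2).sum() / ((r ** 2).sum() + (partial ** 2).sum())

# Hypothetical correlation matrix with one broad common factor
R = np.array([[1.0, 0.60, 0.50],
              [0.60, 1.0, 0.55],
              [0.50, 0.55, 1.0]])
print(kmo(R))
```

When variables share broad common variance, partial correlations are small relative to the raw correlations and KMO approaches 1; when correlations are mostly pairwise quirks, the partials stay large and KMO drops.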
Steps in Factor Analysis:
Factor Extraction
 2nd Step: Factor extraction
 The primary objective of this stage is to determine the factors.
 Initial decisions can be made here about the number of factors
underlying a set of measured variables.
 Estimates of initial factors are obtained using Principal components
analysis.
 The principal components analysis is the most commonly used
extraction method . Other factor extraction methods include:
 Maximum likelihood method
 Principal axis factoring
 Alpha method
 Unweighted least squares method
 Generalized least square method
 Image factoring.
Steps in Factor Analysis:
Factor Extraction
 In principal components analysis, linear combinations of the observed
variables are formed.

 The 1st principal component is the combination that accounts for the largest
amount of variance in the sample (1st extracted factor).

 The 2nd principal component accounts for the next largest amount of variance
and is uncorrelated with the first (2nd extracted factor).

 Successive components explain progressively smaller portions of the total


sample variance, and all are uncorrelated with each other.
Steps in Factor Analysis: Factor
Extraction
 To decide on how many factors we need to represent the data, we
use 2 statistical criteria:

⚪ Eigenvalues, and

⚪ The Scree Plot.

 The determination of the number of factors is usually done by
considering only factors with eigenvalues greater than 1.

 Factors with a variance less than 1 are no better than a single variable,
since each variable is expected to have a variance of 1.

Total Variance Explained

                  Initial Eigenvalues                Extraction Sums of Squared Loadings
Component    Total   % of Variance   Cumulative %    Total   % of Variance   Cumulative %
    1        3.046      30.465          30.465       3.046      30.465          30.465
    2        1.801      18.011          48.476       1.801      18.011          48.476
    3        1.009      10.091          58.566       1.009      10.091          58.566
    4         .934       9.336          67.902
    5         .840       8.404          76.307
    6         .711       7.107          83.414
    7         .574       5.737          89.151
    8         .440       4.396          93.547
    9         .337       3.368          96.915
   10         .308       3.085         100.000

Extraction Method: Principal Component Analysis.


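The "eigenvalue greater than 1" rule can be sketched in numpy on made-up data, where 6 observed variables are generated from 2 latent factors:

```python
import numpy as np

# Hypothetical data: 6 observed variables driven by 2 latent factors
rng = np.random.default_rng(2)
n = 500
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack([
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
])
R = np.corrcoef(X, rowvar=False)            # correlation matrix of the variables
eigvals = np.linalg.eigvalsh(R)[::-1]       # eigenvalues, largest first
n_factors = int((eigvals > 1.0).sum())      # Kaiser criterion: keep eigenvalues > 1
print(n_factors)                            # recovers the 2 underlying factors
```

The two large eigenvalues absorb the shared variance of each block of three variables, while the remaining eigenvalues fall well below 1.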
Steps in Factor Analysis: Factor
Extraction

 The examination of the scree plot provides a
visual of the total variance associated with
each factor.

 The steep slope shows the large factors.

 The gradual trailing off (the "scree") shows the rest
of the factors, usually below an eigenvalue of 1.

 In choosing the number of factors, in addition
to the statistical criteria, one should make
initial decisions based on conceptual and
theoretical grounds.

 At this stage, the decision about the number of
factors is not final.
Steps in Factor Analysis: Factor
Extraction
Component Matrix using Principal Component Analysis
Component Matrixa

Component

1 2 3

I discussed my frustrations and feelings with person(s) in school .771 -.271 .121

I tried to develop a step-by-step plan of action to remedy the problems .545 .530 .264

I expressed my emotions to my family and close friends .580 -.311 .265

I read, attended workshops, or sought some other educational approach to correct the .398 .356 -.374
problem

I tried to be emotionally honest with myself about the problems .436 .441 -.368

I sought advice from others on how I should solve the problems .705 -.362 .117

I explored the emotions caused by the problems .594 .184 -.537

I took direct action to try to correct the problems .074 .640 .443

I told someone I could trust about how I felt about the problems .752 -.351 .081

I put aside other activities so that I could work to solve the problems .225 .576 .272

Extraction Method: Principal Component Analysis.

a. 3 components extracted.
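Using the first item's loadings from the component matrix above, its communality (the sum of squared loadings, defined earlier) can be checked directly:

```python
import numpy as np

# Loadings of the first item on the 3 extracted components (from the table above)
loadings = np.array([0.771, -0.271, 0.121])
communality = float((loadings ** 2).sum())  # variance this item shares with the factors
print(round(communality, 3))  # 0.683
```

So about 68% of this item's variance is accounted for by the three extracted components; the remaining 32% is unique variance and error.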
Steps in Factor Analysis:
Factor Rotation
 3rd Step: Factor rotation.
 In this step, factors are rotated.

 Un-rotated factors are typically not very interpretable (most factors are
correlated with many variables).

 Factors are rotated to make them more meaningful and easier to interpret
(each variable is associated with a minimal number of factors).

 Different rotation methods may result in the identification of


somewhat different factors.
Steps in Factor Analysis: Factor Rotation
 The most popular rotational method is Varimax rotation.

 Varimax uses orthogonal rotations, yielding uncorrelated
factors/components.

 Varimax attempts to minimize the number of variables that have high


loadings on a factor. This enhances the interpretability of the factors.
Steps in Factor Analysis: Factor
Rotation
 Other common rotational method used include Oblique rotations
which yield correlated factors.

 Oblique rotations are less frequently used because their results are
more difficult to summarize.

 Other rotational methods include:


 Quartimax (Orthogonal)
 Equamax (Orthogonal)
 Promax (oblique)
Rotated Component Matrix

Component

1 2 3

I discussed my frustrations and feelings with person(s) in school .803 .186 .050

I tried to develop a step-by-step plan of action to remedy the problems .270 .304 .694

I expressed my emotions to my family and close friends .706 -.036 .059

I read, attended workshops, or sought some other educational approach to .050 .633 .145
correct the problem

I tried to be emotionally honest with myself about the problems .042 .685 .222

I sought advice from others on how I should solve the problems .792 .117 -.038

I explored the emotions caused by the problems .248 .782 -.037

I took direct action to try to correct the problems -.120 -.023 .772

I told someone I could trust about how I felt about the problems .815 .172 -.040

I put aside other activities so that I could work to solve the problems -.014 .155 .657
Steps in Factor Analysis:
Making Final Decisions

4th Step: Making final decisions
⚪ The final decision about the number of factors to choose is the number of
factors for the rotated solution that is most interpretable.
⚪ To identify factors, group variables that have large loadings for the same
factor.
⚪ Plots of loadings provide a visual for variable clusters.

⚪ Interpret factors according to the meaning of the variables


This decision should be guided by:
⚪ A priori conceptual beliefs about the number of factors, from past research or
theory.

⚪ Eigenvalues computed in step 2.

⚪ The relative interpretability of rotated solutions computed in step 3.
Principal Component Analysis

▪ Principal Component Analysis, also known as PCA, is a
feature extraction method in which we create new
independent features from combinations of the old
features and keep only those that are most important in
predicting the target. New features are extracted from the
old features, and any feature considered less relevant to
the target variable can be dropped.
The best line for projection lies in the direction of largest variance.
The coordinate system is modified so that, once the data are projected onto the best line, each point is
reduced to a 1-D representation y.
Along this direction (the green line), the new data y retain the variance of the old data x.
PCA preserves the maximum variance in the data.
Doing PCA on n dimensions generates a new set of n dimensions: the first principal component captures the
maximum variance in the underlying data, and the second principal component is orthogonal to it.
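A compact numpy sketch of this idea (on synthetic 2-D data stretched along one axis): center the data, eigendecompose the covariance matrix, and project onto the direction of largest variance:

```python
import numpy as np

# Hypothetical 2-D data stretched along one direction
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

Xc = X - X.mean(axis=0)              # center the data
cov = np.cov(Xc, rowvar=False)       # covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]    # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

y = Xc @ eigvecs[:, 0]               # 1-D projection onto the best line
print(eigvals)                       # first eigenvalue dominates
```

The variance of the 1-D projection y equals the first eigenvalue, which is exactly the "maximum variance preserved" property described above.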
PCA VS FA

▪ PCA components explain the maximum amount of variance while factor


analysis explains the covariance in data.
▪ PCA components are fully orthogonal to each other whereas factor
analysis does not require factors to be orthogonal.
▪ PCA component is a linear combination of the observed variable while in
FA, the observed variables are linear combinations of the unobserved
variable or factor.
▪ PCA components are often hard to interpret. In FA, the
underlying factors are labelable and interpretable.
▪ PCA is a kind of dimensionality reduction method, whereas
factor analysis is a latent variable method.
▪ PCA is sometimes treated as a type of factor analysis, but PCA
is observational whereas FA is a modeling technique.
