INFERENTIAL ANALYSES II
(RELATIONSHIPS)
Dr. Abdul Rahman Mahmoud Fata Nahhas
KOP – IIUM
Final Year Research Project
SEM 1
2023-24
CORRELATION ANALYSIS
Introduction
Correlation measures the strength and direction of the
relationship between two variables
Partial correlation: three or more variables are included,
& the correlation between two variables is explored while
the effect of the others is removed
E.g., the correlation between blood pressure and amount of
salt intake after adjustment for the effect of a third variable,
such as amount of fluid intake
Introduction
Example (positive correlation)
Typically, in the summer as the temperature
increases people are thirstier, consuming
more water
Introduction
For seven random summer days, a person recorded the
temperature and his water consumption during a
three-hour period spent outside:

Temperature (C)   Water Consumption (L)
25                1
29                1.3
35                1.7
37                1.9
39                2
41                2.3
44                3.1
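As a quick numerical check of this example, Pearson's r can be computed with SciPy (assumed to be available); the values are the seven temperature/consumption pairs from the table:

```python
# Pearson's r for the temperature / water-consumption data in the
# table above (values copied from the slide). Assumes SciPy is installed.
from scipy.stats import pearsonr

temperature = [25, 29, 35, 37, 39, 41, 44]   # degrees C
water = [1, 1.3, 1.7, 1.9, 2, 2.3, 3.1]      # liters

r, p_value = pearsonr(temperature, water)
print(f"r = {r:.3f}, p = {p_value:.4f}")
```

For these points r comes out large and positive, matching the upward trend described in the example.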
Introduction
[Scatterplot: Water Consumption (L) against Temperature (C) for the seven days above, showing an upward trend]
Introduction
Correlation treats all variables equally
Correlation does not take into consideration
whether a variable has been classified as a
dependent or independent variable
Introduction
For instance, you might want to find out whether
basketball performance is correlated with a
person's height
You would then plot a graph of performance against
height and calculate the correlation coefficient r
If, let's say, r = 0.72, we can conclude
that as height increases so does basketball
performance
Types of correlation
Two main types of correlation analysis:

Pearson product-moment correlation (parametric) REQUIRES:
1. Normally distributed data
2. A linear relationship between the two variables in question
3. No heteroscedasticity

Spearman's rank-order correlation (non-parametric) DOES NOT
REQUIRE the Pearson correlation assumptions
Pearson product-moment correlation
A parametric measure of the strength and direction of a
linear relationship that exists between two continuous
variables
Denoted by the symbol r
Attempts to draw a line of best fit through the data of two
variables
The Pearson correlation coefficient, r, indicates how far
these data points fall from the line of best fit (i.e.,
how well the data points fit this line)
Spearman Rank-order Correlation
A nonparametric measure of the strength and direction of
relationship that exists between two variables measured on at
least an ordinal scale
Denoted by the symbol rs (or the Greek letter ρ, pronounced
rho)
Used for either ordinal variables or for continuous data that
has failed the assumptions necessary for conducting the
Pearson's product-moment correlation
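To illustrate the Spearman alternative, the sketch below pairs an ordinal variable with a continuous one; both data columns are invented for the example, and SciPy is assumed to be available:

```python
# Spearman's rho works on ranks, so it assumes only a monotonic (not
# necessarily linear) relationship. Both variables below are hypothetical.
from scipy.stats import spearmanr

pain_score = [1, 2, 2, 3, 4, 5]       # ordinal pain scale (illustrative)
dose_mg = [5, 10, 12, 20, 40, 80]     # hypothetical drug doses

rho, p_value = spearmanr(pain_score, dose_mg)
print(f"rho = {rho:.3f}, p = {p_value:.4f}")
```

Because the relationship is monotonic, rho is close to +1 even though the dose values grow non-linearly.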
Detecting a linear relationship
How can you detect a linear relationship
between tested variables?
Simply plot the variables on a graph
(a scatterplot, for example), visually
inspect the graph's shape, and observe the data
points and their location relative to the line of
best fit
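Such a scatterplot can be produced in a few lines, assuming matplotlib is installed; the data reuse the slide's temperature/water table, and the file name is arbitrary:

```python
# Scatterplot for visual inspection of linearity, reusing the slide's
# temperature / water data. The Agg backend renders to a file, so no
# display is needed.
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

temperature = [25, 29, 35, 37, 39, 41, 44]
water = [1, 1.3, 1.7, 1.9, 2, 2.3, 3.1]

plt.scatter(temperature, water)
plt.xlabel("Temperature (C)")
plt.ylabel("Water Consumption (L)")
plt.savefig("scatter.png")
```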
Detecting a linear relationship
[Scatterplot: data points clustered tightly around an upward straight line — linear relationship]
Detecting a linear relationship
[Scatterplot: a second example of data points following a straight line — linear relationship]
Detecting a linear relationship
[Scatterplot: data points with no straight-line pattern — non-linear relationship]
Detecting a linear relationship
[Scatterplot: data points following a curve — curvilinear relationship]
Correlation Coefficient
With the help of Correlation Coefficient, we can
determine:
1. The DIRECTION of the relation →
Positive or Negative
2. The STRENGTH of the relation among the
variables
Direction of Correlation
[Scatterplot: Water Consumption (L) rising with Temperature (C) — positive correlation]
Direction of Correlation
[Scatterplot: Stress Score (Y) decreasing as Work Performance Score (X) increases — negative correlation]
Strength of Correlation
Strength    Positive coefficient    Negative coefficient
Small       0.1 to 0.29             -0.1 to -0.29
Medium      0.3 to 0.49             -0.3 to -0.49
Large       0.5 to 1                -0.5 to -1
Strength of Correlation
If r (or rs) equals zero, then there is NO
RELATIONSHIP between the two variables
r = 1 → perfect positive linear relationship
r = -1 → perfect negative linear relationship
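The strength bands above can be wrapped into a small helper; the "negligible" label for |r| < 0.1 is my own addition, since the slide's table starts at 0.1:

```python
# Maps a correlation coefficient onto the strength bands from the
# table above (Cohen's conventions).
def correlation_strength(r: float) -> str:
    if not -1 <= r <= 1:
        raise ValueError("r must lie in [-1, 1]")
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"   # below the table's smallest band

print(correlation_strength(0.72))   # large
print(correlation_strength(-0.35))  # medium
```

Note that the sign of r only matters for direction; strength is judged on the absolute value.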
Strength of Correlation
Achieving a value of +1 or -1 means that all your
data points fall exactly on the line of best fit
There are no data points that show
any variation away from this line
[Two scatterplots: all points lying on a straight descending line (r = -1), and all points lying on a straight ascending line (r = +1)]
REGRESSION ANALYSIS
Definition
A predictive statistical method that investigates
the strength of the relationship between TWO
SETS of variables
It studies the dependence of one or more
variables (dependent variables) on one or more
other variables (independent or predictor variables)
Regression Main Purposes
Regression PRIMARILY used to:
1. Estimate (describe) the relationship that exists between
the dependent variable(s) and the explanatory variable(s)
2. Determine the strength of impact of each of the predictor
variables on the dependent variable(s), controlling for the
effects of all other predictor variables
3. Predict the value of dependent variable(s) for a given value
of the predictor variable(s)
Regression Equation
Can be obtained from all types of regression analysis
Once known, the regression equation is used to predict
values of the dependent variable, given the values of
the independent (predictor) variables
E.g., if we knew a person's weight, we could then
predict their blood pressure using the regression
equation
Regression Equation
E.g., using the simple linear regression model, an equation
obtained can be as the following:
Y = β0 + β1 * X + e
Typically, Y is referred to as the dependent variable, &
X as the independent variable
β0 is the intercept of the estimated line, i.e., the value of Y
when X = 0
β1 is the gradient [slope] of the estimated line, i.e.,
the amount by which Y changes with a one-unit change in X
e is the error term or disturbance in the relationship;
it represents factors other than X that affect Y
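The weight/blood-pressure example above can be sketched numerically; the sample values below are invented, and numpy's `polyfit` estimates the slope and intercept by ordinary least squares:

```python
# Ordinary-least-squares fit of Y = b0 + b1*X on a made-up weight /
# blood-pressure sample illustrating the regression-equation slide.
import numpy as np

weight_kg = np.array([60, 65, 70, 75, 80, 85, 90])           # X (hypothetical)
systolic_bp = np.array([110, 114, 118, 121, 126, 128, 133])  # Y (hypothetical)

b1, b0 = np.polyfit(weight_kg, systolic_bp, deg=1)  # slope, then intercept
print(f"BP = {b0:.2f} + {b1:.3f} * weight")

# Prediction for a new value of X, as the slide describes:
print(f"predicted BP at 82 kg: {b0 + b1 * 82:.1f}")
```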
Types of Regression
Regression analysis is generally classified into two types:

Simple regression involves only two variables, one of which
is the dependent variable and the other the explanatory
(independent) variable. The associated model is a simple
regression model.

Multiple regression involves more than two variables: mainly,
one is the dependent variable and the others are explanatory
(independent) variables. The associated model is a multiple
regression model.
Types of Regression
                        Type of dependent variable
Number of predictors    Continuous         Categorical
1                       Simple Linear      Simple Logistic
>1                      Multiple Linear    Multiple Logistic
Linear Regression
Linear Regression establishes a relationship
between dependent (Continuous) variable
(Y) and one or more independent (predictor)
variables (X) using a best fit straight
line (also known as regression line)
Linear Regression
E.g., predicting a patient's measured blood glucose level (in
mg/dl) based on the dose of insulin infusion (in IU) … SIMPLE
LINEAR REGRESSION
Presume a sample of 20 DM patients for whom insulin infusion was
administered
We can plot the values on a graph, with insulin dose on the X axis
and blood glucose on the Y axis
If there were a perfect linear relationship between insulin dose and
blood glucose, then all 20 points on the graph would fit on a straight
line (But, this is never the case [unless your data are rigged])
Linear Regression
E.g., predicting a patient's measured blood glucose level (in
mg/dl) based on the dose of insulin infusion (in IU) … SIMPLE
LINEAR REGRESSION
If there is a (non-perfect) linear relationship between insulin dose
and blood glucose (presumably a negative), then we would get a
cluster of points on the graph which slopes downward
In other words, as insulin dose is increased; blood glucose level
declines…
Linear Regression
[Scatterplot: Glucose level against Insulin dose with the fitted regression line]
Y = β0 + β1 * X + e
BG = - 7.15 + .095 * Insulin dose
Linear Regression
MULTIPLE LINEAR REGRESSION is the same idea as
simple linear regression, except that we have several
independent variables predicting the dependent variable
To continue with the previous example, assume that we now
want to predict a patient's BG from insulin dose and gender
as well. In other words, we need to see whether gender also
has an impact on the measured BG
In this case, the independent variables (predictors) are Insulin
dose & Gender, while the dependent variable is BG
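A multiple-regression version of this example can be sketched with numpy; every value below is invented (the slide supplies no data set), and the fit is ordinary least squares via `lstsq`:

```python
# Multiple linear regression sketch: predicting BG from insulin dose
# and gender (coded 0/1). All values are hypothetical.
import numpy as np

dose = np.array([10, 15, 20, 25, 30, 35, 40, 45], dtype=float)
gender = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)  # arbitrary 0/1 coding
bg = np.array([180, 172, 160, 150, 141, 131, 122, 110], dtype=float)

# Design matrix: intercept column, dose, gender
X = np.column_stack([np.ones_like(dose), dose, gender])
(b0, b_dose, b_gender), *_ = np.linalg.lstsq(X, bg, rcond=None)
print(f"BG = {b0:.1f} + {b_dose:.2f}*dose + {b_gender:.2f}*gender")
```

Each coefficient here is the effect of that predictor while holding the other constant, which is exactly the "controlling for" idea the following slides discuss.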
Linear Regression
Multiple regression tells us the predictive
value of the overall model; all predictor
variables…
In our example, then, the regression would
tell us how well Insulin dose and Gender together
predict a patient's BG
Linear Regression
DETERMINES THE STRENGTH OF IMPACT OF EACH PREDICTOR
VARIABLE ON THE DEPENDENT VARIABLE(S), CONTROLLING FOR THE
EFFECTS OF ALL OTHER EXPLANATORY VARIABLES
Multiple regression ALSO tells us how well each
predictor variable predicts the dependent variable,
controlling for each of the other predictor variables…
In our example, then, the regression would tell us how
well Insulin dose predicts a patient's BG, while
controlling for Gender, as well as how well Gender
predicts a patient's BG, while controlling for Insulin
dose
Linear Regression
Assumptions
1. Number of cases: When doing regression, the cases-to-
Independent-Variables (IVs) ratio should ideally be 20:1,
that is, 20 cases for every IV in the model. The lowest
acceptable ratio is 5:1 (i.e., 5 cases for every IV
in the model)
2. Normality: the scores for each variable should be
normally distributed
3. Linearity: There must be linear relationship between
independent and dependent variables
Linear Regression
Assumptions
4. Absence of Multicollinearity: Multicollinearity exists when the
independent variables are highly correlated (r=.9 and above)
5. Absence of Singularity: Singularity occurs when one
independent variable is actually a combination of other
independent variables (e.g. when both subscale scores and the
total score of a scale are included)
6. Outliers: Linear regression is very sensitive to outliers (very
high or very low values on a particular item). Outliers can severely
distort the regression line and, consequently, the forecasted values
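The multicollinearity assumption above can be screened before fitting by checking pairwise correlations; the three predictors below are illustrative, with `years_smoking` deliberately constructed to track `age` closely:

```python
# Flag any predictor pair with |r| >= 0.9, the multicollinearity
# threshold mentioned in the assumptions. All data are hypothetical.
import numpy as np

predictors = {
    "age": np.array([25, 30, 35, 40, 45, 50]),
    "years_smoking": np.array([5, 11, 14, 21, 24, 31]),
    "bmi": np.array([22, 30, 24, 27, 35, 23]),
}

names = list(predictors)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        r = np.corrcoef(predictors[names[i]], predictors[names[j]])[0, 1]
        flag = "  <-- possible multicollinearity" if abs(r) >= 0.9 else ""
        print(f"{names[i]} vs {names[j]}: r = {r:.2f}{flag}")
```

In practice one of any flagged pair would be dropped or the two combined before running the regression.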
Logistic Regression
Used to estimate the probability of an event
occurring (success) versus not occurring (failure)
Used when the dependent variable is binary
(0/1, True/False, Yes/No) in nature
Logistic Regression
E.g., predicting whether people in a group have depression
(Yes/No) based, for instance, on
place of residence (Urban/Rural) … SIMPLE
LOGISTIC REGRESSION
Presume a sample of 50 persons for whom depression
was assessed by a psychologist
Each person's place of residence was also reported
Logistic Regression
E.g., predicting whether people in a group have depression
(Yes/No) based, for instance, on place of residence
(Urban/Rural) … SIMPLE LOGISTIC REGRESSION
On a graph, we can plot the result of the depression
assessment (Y/N) on the Y axis and the reported
place of residence (U/R) on the X axis
From the graph, we can infer whether depression is more
likely to be present among urban or rural persons
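A toy version of this simple logistic regression can be written from scratch; the 0/1 residence codes (0 = rural, 1 = urban) and outcomes below are invented, and the model is fitted by plain gradient descent on the log-loss rather than by a statistics package:

```python
# Simple logistic regression: P(depressed) = sigmoid(b0 + b1 * residence).
# All data are hypothetical; fitted by gradient descent.
import math

residence = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]   # 0 = rural, 1 = urban
depressed = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]   # hypothetical outcomes

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

b0, b1 = 0.0, 0.0
lr = 0.1
for _ in range(5000):
    g0 = g1 = 0.0
    for x, y in zip(residence, depressed):
        err = sigmoid(b0 + b1 * x) - y   # gradient of the log-loss
        g0 += err
        g1 += err * x
    b0 -= lr * g0 / len(residence)
    b1 -= lr * g1 / len(residence)

p_rural = sigmoid(b0)        # modelled P(depressed | rural)
p_urban = sigmoid(b0 + b1)   # modelled P(depressed | urban)
print(f"P(depressed | rural) = {p_rural:.2f}")
print(f"P(depressed | urban) = {p_urban:.2f}")
```

With a single binary predictor, the fitted probabilities converge toward the observed group proportions, which is exactly the "more likely among urban or rural" comparison described above.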
Logistic Regression
Assumptions
Number of cases: When doing regression, the cases-to-
Independent-Variables (IVs) ratio should ideally be 20:1, that is,
20 cases for every IV in the model. The lowest acceptable
ratio is 5:1 (i.e., 5 cases for every IV
in the model)
Normality: Logistic regression doesn't require the data to be
normally distributed (it is a non-parametric test)
Linearity: Logistic regression doesn't require a linear
relationship between the dependent and independent variables
Logistic Regression
Assumptions
Absence of Multicollinearity: Multicollinearity exists when the
independent variables are highly correlated (r=.9 and above)
Absence of Singularity: Singularity occurs when one
independent variable is actually a combination of other
independent variables (e.g. when both subscale scores and the
total score of a scale are included)
Outliers: Logistic regression is sensitive to outliers (very high or
very low values on a particular item). Outliers can severely distort
the fitted model and, consequently, the forecasted values
THANK
YOU!