Regression Analysis
Foundation Skills Academy
Index
1. Introduction to Regression Analysis
2. Types of Regression Analysis
3. Linear Regression Model
4. Example: Simple Linear Regression
5. Example: Multiple Linear Regression
6. Logistic Regression Model
7. Example: Logistic Regression
Introduction to Regression Analysis
Regression analysis is a statistical method for understanding and quantifying the relationship between two or more variables.
It helps a business estimate one dependent variable based on the values of one or more independent variables.
Dependent Variable: The dependent variable is essentially the "outcome" you’re trying to understand or predict. It’s the
focus of your study, whether you’re looking at quarterly sales figures, customer satisfaction ratings, or any other key result.
Independent Variable: Independent variables are the "factors" that might influence or cause changes in the dependent
variable. These are the variables you manipulate or observe to see their impact on your outcome of interest. For example, if
you adjust the price of a product, that price change is an independent variable that could affect sales figures.
Output of Regression Analysis
Data Analysis – Types of Regression Analysis
Simple Linear Regression: Simple linear regression is used when a single independent variable predicts a dependent
variable. The linear regression formula is represented as Y = a + bX, where Y is the dependent variable, X is the independent
variable, a is the intercept (the value of Y when X = 0), and b is the slope, also called the coefficient (the change in Y for a unit change in X).
Business Application: It's frequently used to identify how a change in one variable will affect another. For example, predicting
sales based on advertising expenditure or estimating employee productivity based on hours worked.
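As an illustrative sketch (the advertising/sales numbers below are made up, not from the slides), the intercept a and slope b of Y = a + bX can be computed directly in Python with the least-squares closed form:

```python
# Simple linear regression: fit Y = a + bX by least squares (closed form).
def fit_simple_linear(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b = covariance(X, Y) / variance(X)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x  # intercept: value of Y when X = 0
    return a, b

# Hypothetical advertising spend (in $10k) vs. sales (in $10k)
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 4.0, 6.2, 8.1, 9.9]
a, b = fit_simple_linear(spend, sales)  # a ≈ 0.15, b ≈ 1.97
```

Each extra unit of spend is then associated with roughly b additional units of sales under this fitted line.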
Multiple Linear regression: Multiple regression extends linear regression by considering multiple independent variables to
predict the dependent variable. The relationship is represented as Y = a + b₁X₁ + b₂X₂ + ... + bₙXₙ
Business Application: Businesses use it to understand how multiple factors influence outcomes. For instance, predicting
home prices based on features like square footage, number of bedrooms, and neighborhood.
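As a sketch, the coefficients of a multiple linear model can be found by solving the normal equations (XᵀX)β = Xᵀy. The pure-Python example below uses synthetic data generated exactly from Y = 1 + 2X₁ + 3X₂, so the fit should recover those coefficients:

```python
# Multiple linear regression Y = a + b1*X1 + b2*X2 via the normal equations
# (X^T X) beta = X^T y, solved with Gaussian elimination. Data is synthetic.
def fit_multiple_linear(rows, ys):
    X = [[1.0] + list(r) for r in rows]  # prepend 1 for the intercept a
    n, k = len(X), len(X[0])
    XtX = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    Xty = [sum(X[i][p] * ys[i] for i in range(n)) for p in range(k)]
    # Forward elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(XtX[r][col]))
        XtX[col], XtX[piv] = XtX[piv], XtX[col]
        Xty[col], Xty[piv] = Xty[piv], Xty[col]
        for r in range(col + 1, k):
            f = XtX[r][col] / XtX[col][col]
            for c in range(col, k):
                XtX[r][c] -= f * XtX[col][c]
            Xty[r] -= f * Xty[col]
    # Back substitution
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (Xty[r] - sum(XtX[r][c] * beta[c]
                                for c in range(r + 1, k))) / XtX[r][r]
    return beta  # [a, b1, b2, ...]

rows = [(1, 1), (2, 1), (1, 2), (3, 2), (2, 3)]
ys = [6.0, 8.0, 9.0, 13.0, 14.0]   # generated from Y = 1 + 2*X1 + 3*X2
beta = fit_multiple_linear(rows, ys)  # beta ≈ [1.0, 2.0, 3.0]
```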
Non-Linear regression: It is used in cases where the relationship between the dependent and independent variables is
nonlinear. The model can take various forms depending on the specific problem. It is generally represented as Y = f(X, θ)
where θ represents the parameters of the nonlinear function f.
Data Analysis – Types of Regression Analysis
Examples of nonlinear regression:
Logistic Regression: Logistic regression is used when the dependent variable is binary (two possible outcomes) or
categorical. It models the probability of a particular outcome occurring.
Business Application: In business, logistic regression is employed for tasks like predicting customer churn (yes/no), whether
a customer will purchase a product (yes/no), or whether a loan applicant will default on a loan (yes/no).
Polynomial Regression: Polynomial regression is used when the relationship between the independent and dependent
variables follows a polynomial curve and is not linear.
Business Application: It can be used to model more complex relationships in data, such as predicting the growth of a plant
based on time and other environmental factors.
Exponential Regression: Exponential regression is a type of nonlinear regression that fits an exponential function to the
data. The general form of an exponential regression model is Y = a·e^(bX).
Power Regression: Power regression is a type of nonlinear regression that fits a power function to the data. The general
form of a power regression model is Y = a·X^b.
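Both models can often be fitted by transforming them into a linear one: taking logs of Y = a·e^(bX) gives ln(Y) = ln(a) + bX, which ordinary least squares can handle. A pure-Python sketch for the exponential case, on synthetic data generated with a = 2 and b = 0.5 (valid only when all Y values are positive):

```python
import math

# Exponential regression Y = a * e^(bX) fitted by log-linearization:
# ln(Y) = ln(a) + bX is linear in X, so least squares applies.
# Requires all Y > 0. Data below is synthetic (a = 2, b = 0.5).
def fit_exponential(xs, ys):
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mx = sum(xs) / n
    ml = sum(logs) / n
    b = sum((x - mx) * (l - ml) for x, l in zip(xs, logs)) / \
        sum((x - mx) ** 2 for x in xs)
    ln_a = ml - b * mx
    return math.exp(ln_a), b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)  # recovers a ≈ 2.0, b ≈ 0.5
```

Power regression works the same way after taking logs of both X and Y.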
Data Analysis – Importance of Regression Analysis
Predictive Modeling: Regression analysis is commonly used for predictive modeling. By examining historical data and
identifying relationships between variables, businesses can make informed predictions about sales, demand, etc.
Identifying Key Drivers: Regression analysis can help identify which independent variables significantly impact the
dependent variable. For example, it can determine which marketing channels or advertising strategies influence sales the most.
Optimizing Decision Making: Whether it's optimizing pricing strategies, production processes, or marketing campaigns,
regression can help companies allocate resources efficiently and achieve better outcomes.
Risk Assessment: Businesses are exposed to various risks, such as economic fluctuations, market changes, and
competitive pressures. Regression analysis-powered risk assessment techniques can be used to assess how changes in
independent variables may affect business performance.
Performance Evaluation: Regression analysis can evaluate the effectiveness of different initiatives and strategies. For
instance, it can assess the impact of employee training on productivity or the relationship between customer satisfaction
and repeat purchases.
Market Research: In market research, regression analysis can be used to understand consumer behavior and
preferences. By examining demographics, pricing, and product features, businesses can tailor their products and
marketing efforts to specific target audiences.
How to Perform Regression Analysis?
Data collection and preparation: Gather and clean data, ensuring it meets assumptions like linearity and independence.
Appropriate regression model: Choose the correct type of regression (linear, polynomial, etc.) based on the data and objective.
Data analysis and interpretation: Test regression assumptions, assess model accuracy, and interpret coefficients.
Model evaluation and validation: Test the model's performance using metrics like R-squared and mean squared error.
• p-values and coefficients in regression analysis work together to tell you which relationships in the model are statistically
significant and the nature of those relationships.
• The linear regression coefficients describe the mathematical relationship between each independent variable and the
dependent variable. The p values for the coefficients indicate whether these relationships are statistically significant.
• After fitting a regression model, check the residual plots to be sure that you have unbiased estimates.
• R-squared is a goodness-of-fit measure for linear regression models. This statistic indicates the percentage of the
variance in the dependent variable that the independent variables explain collectively. R-squared measures the strength
of the relationship between your model and the dependent variable on a 0 – 100% scale. For example, an R-squared of
60% reveals that 60% of the variability observed in the target variable is explained by the regression model. Generally, a
higher R-squared indicates more variability is explained by the model.
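R-squared can be computed directly as 1 − SSres / SStot. A small Python sketch with hypothetical actual and fitted values:

```python
# R-squared: share of the variance in y explained by the model's predictions.
def r_squared(y_actual, y_pred):
    mean_y = sum(y_actual) / len(y_actual)
    ss_tot = sum((y - mean_y) ** 2 for y in y_actual)       # total variance
    ss_res = sum((y - p) ** 2 for y, p in zip(y_actual, y_pred))  # unexplained
    return 1 - ss_res / ss_tot

# Hypothetical actual vs. fitted values
actual = [3.0, 5.0, 7.0, 9.0]
fitted = [2.8, 5.1, 7.2, 8.9]
r2 = r_squared(actual, fitted)  # ≈ 0.995, i.e. 99.5% of variance explained
```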
Using software tools: Use Python or R to perform regression analysis efficiently.
Assumptions of Linear Regression Analysis
Linearity: The relationship between the independent and dependent variables is linear.
Sample representativeness: The sample is representative of the population.
Normally distributed errors: The errors are normally distributed.
Homoscedasticity: The variance of the errors (residuals) remains constant across all levels of the independent
variable(s). Put simply, it signifies that the dispersion of residuals stays consistent, enhancing the accuracy and legitimacy
of regression predictions.
No multicollinearity: When independent variables are highly correlated, it becomes challenging to determine their
impact on the dependent variable.
No outliers: There are no outliers in the data.
Simple Linear Regression Example
You are a social researcher interested in the relationship between income and happiness. You survey 500 people whose
incomes range from $15k to $75k and ask them to rank their happiness on a scale from 1 to 10.
Your independent variable (income) and dependent variable (happiness) are both quantitative, so you can do a regression
analysis to see if there is a linear relationship between them.
R code for simple linear regression:
income.happiness.lm <- lm(happiness ~ income, data = income.data)
This code takes the data you have collected (data = income.data) and calculates the effect that the independent variable
income has on the dependent variable happiness using the equation for the linear model: lm().
To view the results of the model, you can use the summary() function in R:
summary(income.happiness.lm)
Note: In linear regression, while the dependent variable must be continuous (e.g., age, weight, temperature), the independent
variables can be either continuous or categorical (e.g., gender, city, type of product, after encoding them as dummy variables).
Simple Linear Regression Example
Results of the Model:
This output table first repeats the formula
that was used to generate the results
(‘Call’), then summarizes the model
residuals (‘Residuals’), which give an idea
of how well the model fits the real data.
Next is the ‘Coefficients’ table. The first
row gives the estimate of the y-intercept,
and the second row gives the regression
coefficient of the model.
happiness = 0.20 + 0.71*income ± 0.018
The number in the table (0.713) tells us
that for every one-unit increase in income
(where one unit of income = $10,000),
there is a corresponding 0.71-unit
increase in reported happiness (where
happiness is measured on a scale of 1 to 10).
Simple Linear Regression Example
Results of the Model:
The Std. Error column shows how much
variation there is in our estimate of the
relationship between income and
happiness.
The t value column displays the test
statistic. The larger the test statistic, the
less likely it is that our results occurred by
chance.
The Pr(>| t |) column shows the p value.
The p-value indicates whether the
independent variable has a significant
influence. p-values smaller than 0.05 (or
sometimes 0.001) are considered
significant.
Because the p value is so low (p < 0.001),
we can conclude that income has a
statistically significant effect on
happiness.
Simple Linear Regression Example
Homoscedasticity - Residual Plots
A residual is a measure of how far away a point is vertically from the regression line. Simply, it is the error between a
predicted value and the observed actual value.
The most important assumption
of a linear regression model is
that the errors are independent
and normally distributed.
A few characteristics of a good
residual plot are as follows:
• It has a high density of
points close to the origin and
a low density of points away
from the origin
• It is symmetric about the
origin
Multiple Linear Regression Analysis Example - Marketing Mix Modeling
Market Mix Modeling (MMM) is a technique that helps quantify the impact of several marketing inputs on sales or
market share. The purpose of using MMM is to understand how much each marketing input contributes to sales, and how
much to spend on each marketing input. Specifically, here are some ways MMM helps businesses thrive:
Optimizing marketing spending: Helps businesses understand which marketing activities contribute most effectively to
achieving business objectives.
Budget allocation: After analyzing the ROI of various marketing channels and tactics, businesses can make more
informed decisions about where to allocate their marketing budget for the greatest yield.
Forecasting and planning: Businesses can simulate the impact of changes in marketing strategies or external factors and
use these insights to anticipate potential outcomes and adjust their plans accordingly.
Understanding customer behavior: Helps businesses understand how different customer segments respond to various
marketing stimuli, enabling more targeted and effective marketing strategies.
Continuous improvement: Monitoring key performance metrics and analyzing trends enables businesses to identify
opportunities for optimization, test new strategies, and adapt to changing market conditions, ensuring that their marketing
efforts remain effective and competitive.
Marketing Mix Modeling
This is a more representative setting, as simple linear regression is hardly used in real-life MMM projects; it is too
simplistic and does not handle the complexity of consumer behavior and the media landscape.
In a typical marketing mix modeling project, multiple variables impact the sales performance. To be able to measure the
impact of those variables on sales or any other chosen KPI, the analyst needs to build a robust model which accounts for
all the variables influencing the movement of sales.
The model takes the form Sales = β₀ + β₁x₁ + β₂x₂ + ... + βₖxₖ + ε, where:
• x₁, x₂, ..., xₖ are the independent variables influencing sales.
• Each term βX represents the contribution of the variable X to sales: i.e., how much sales are driven
by the variable X (incremental impact).
• β₀ is the base level of sales, and ε is the error term.
Marketing Mix Modeling – Contribution Chart
A contribution chart visually represents different marketing tactics’ impact on sales. It shows how much each tactic
contributes to the total sales and highlights the most effective ones.
The chart can be used to benchmark performance, compare campaigns over time, and plan for future initiatives.
Contribution charts provide an easy-to-understand overview of where a marketer’s efforts should be focused to maximize
ROI and optimize campaign performance.
Logistic Regression
Types of Logistic Regression
Binary Logistic Regression: Binary logistic regression is used to predict the probability of a binary outcome, such as yes or
no, true or false, or 0 or 1. For example, it could be used to predict whether a customer will churn or not, whether a patient
has a disease or not, or whether a loan will be repaid or not.
Multinomial Logistic Regression: Multinomial logistic regression is used to predict the probability of one of three or more
possible outcomes, such as the type of product a customer will buy, the rating a customer will give a product, or the political
party a person will vote for.
Ordinal Logistic Regression: It is used to predict the probability of an outcome that falls into a predetermined order, such as
the level of customer satisfaction, the severity of a disease, or the stage of cancer.
How to Perform Logistic Regression Analysis?
Prepare the data: The data should be in a format where each row represents a single observation and each column
represents a different variable. The target variable (the variable you want to predict) should be binary (yes/no, true/false, 0/1).
Train the model: We teach the model by showing it the training data. This involves finding the values of the model
parameters that minimize the error in the training data.
Evaluate the model: The model is evaluated on the test data to assess its performance on unseen data.
Use the model to make predictions: After the model has been trained and assessed, it can be used to forecast outcomes
on new data.
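The steps above can be sketched end to end with a minimal logistic regression trained by gradient descent. Everything here (the data, learning rate, and epoch count) is made up for illustration; it is a sketch, not a production implementation:

```python
import math

# Minimal single-feature logistic regression trained by gradient descent.
# Synthetic data: label 1 roughly when x > 2.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.5, epochs=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log-loss with respect to w and b
        gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        gb = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

# 1. Prepare the data (binary target)
xs = [0.0, 1.0, 1.5, 2.5, 3.0, 4.0]
ys = [0, 0, 0, 1, 1, 1]
# 2. Train the model
w, b = train_logistic(xs, ys)
# 3-4. Use the model: predicted probability for a new observation
p = sigmoid(w * 3.5 + b)  # should be well above 0.5
```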
Logistic Regression
• In medicine, a frequent application is to find out which variables have an influence on a disease. In this case, 0 could
stand for not diseased and 1 for diseased. Subsequently, the influence of age, gender and smoking status (smoker or not)
on this particular disease could be examined.
• In linear regression, the independent variables (e.g., age and gender) are used to estimate the specific value of the
dependent variable (e.g., body weight).
• In logistic regression, on the other hand, the dependent variable is dichotomous (0 or 1) and the probability that
outcome 1 occurs is estimated. Returning to the example above, this means: how likely is it that the disease is present
if the person under consideration has a certain age, sex, and smoking status?
To build a logistic regression model, the linear regression equation is used as the starting point.
Logistic Regression
However, if a linear regression were simply calculated for solving a logistic regression, the following result would appear
graphically:
As can be seen in the graph, values between plus and minus infinity can now occur. The goal of logistic regression,
however, is to estimate the probability of occurrence and not the value of the variable itself. Therefore, the equation must be
transformed.
To do this, it is necessary to restrict the value range for the prediction to the range between 0 and 1. To ensure that only
values between 0 and 1 are possible, the logistic function is used.
Logistic Regression
Logistic Function
The logistic model is based on the logistic function. The special thing about the logistic function is that for values between
minus and plus infinity, it always assumes only values between 0 and 1.
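A quick Python sketch of this bounded-range property (the checks below are illustrative):

```python
import math

# The logistic (sigmoid) function maps any real number into (0, 1),
# which is what lets logistic regression output probabilities.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

assert sigmoid(0) == 0.5          # exactly halfway at z = 0
assert 0 < sigmoid(-10) < 0.001   # large negative z approaches 0
assert 0.999 < sigmoid(10) < 1    # large positive z approaches 1
```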
To calculate the probability of a person being sick or not using logistic regression for the example above, the model
parameters b1, b2, b3 and a must first be determined. Once these have been determined, the equation becomes:
P(diseased) = 1 / (1 + e^−(a + b1·age + b2·gender + b3·smoking status))
Key properties of the Logistic Regression equation
Sigmoid Function: The logistic regression model uses a special “S”-shaped curve to predict
probabilities. It ensures that the predicted probabilities stay between 0 and 1, which makes sense for probabilities.
Coefficients: These are just numbers that tell us how much each input affects the outcome in the logistic regression
model. For example, if age is a predictor, the coefficient tells us how much the outcome changes for every one-year
increase in age.
Best Guess: We figure out the best coefficients for the logistic regression model by looking at the data we have and
tweaking them until our predictions match the real outcomes as closely as possible.
Basic Assumptions: We assume that our observations are independent, meaning one doesn’t affect the other. We also
assume that there’s not too much overlap between our predictors (like age and height).
Linearity in the Logit: The relationship between the independent variables and the logit of the dependent variable (ln(p /
(1−p))) is assumed to be linear. This doesn’t necessarily mean the outcome itself has a linear relationship with the
independent variables, but the log-odds do.
Probabilities, Not Certainties: Instead of saying “yes” or “no” directly, logistic regression gives us probabilities, like
saying there’s a 70% chance it’s a “yes” in the logistic regression model. We can then decide on a cutoff point to make our
final decision.
Checking Our Work: We have some tools to make sure our predictions are good, like accuracy, precision, recall, and a
curve called the ROC curve.
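As a small sketch of the “Probabilities, Not Certainties” point above: a cutoff converts predicted probabilities into final yes/no decisions (the probabilities below are made up):

```python
# Thresholding: logistic regression outputs probabilities,
# and a chosen cutoff converts them to final yes/no decisions.
def classify(probs, cutoff=0.5):
    return [1 if p >= cutoff else 0 for p in probs]

probs = [0.70, 0.40, 0.55, 0.10]
decisions = classify(probs)             # -> [1, 0, 1, 0]
stricter = classify(probs, cutoff=0.6)  # -> [1, 0, 0, 0]
```

Raising the cutoff trades fewer false positives for more false negatives.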
Logistic Regression Example
You have a dataset, and you need to predict whether a candidate will be admitted to the desired college or not, based on
the person’s GRE score, GPA, and college rank.
Steps:
1. In the dataset, we are given the GRE scores, GPAs and college ranks for several students, but it also has a column that
indicates whether those students were admitted or not.
2. Based on this labeled data, you can train the model, validate it, and then use it to predict the admission for any GRE,
GPA and college rank.
3. Once you split the data into training and test sets, you will apply the regression on the three independent variables (GRE,
GPA and Rank), generate the model, and then run the test set through the model.
4. Once that is complete, you will validate the model to see how well it performed.
Data Set
Logistic Regression Example
Model & Results Interpretation
1- Each one-unit increase in GRE increases the log odds of
admission by 0.002, and its p-value indicates that it is
somewhat significant in determining admission.
2- Each one-unit increase in GPA increases the log odds of
admission by 0.80, and its p-value indicates that it is somewhat
significant in determining admission.
3- The interpretation of rank is different from the others: going
from a rank-1 college to a rank-2 college decreases the log odds
of admission by 0.67. Going from rank-2 to rank-3 decreases it
by 1.340.
4- The difference between the null deviance and the residual
deviance tells us whether the model is a good fit: the greater
the difference, the better the model. The null deviance is the
value when you have only the intercept in your equation, with
no variables; it tells us how well the response variable can be
predicted by a model with only an intercept term. The residual
deviance is the value when all the variables are taken into
account.
*When using logistic regression, you should convert a rank from an integer to a factor to indicate that the rank is a categorical variable.
Logistic Regression Example
Prediction
Let’s say a student has a profile with a GRE score of 790, a GPA of 3.8, and a rank-1 college. Now you want to predict
the chances of that student being admitted in the future.
We see that there is an 85% chance that this student will be admitted.
Terminologies used
A confusion matrix measures the performance and accuracy of machine learning
classification models. It gives a breakdown of the predictions made by a model compared
to the actual outcomes.
Accuracy score
Accuracy is the percentage of cases that the model predicted correctly. This is a very high-level
summary; we need more information to evaluate the classifier properly.
Precision (or positive predicted value)
Precision is the ratio of correct positive predictions out of all positive predictions (both
correct and incorrect). If we have high precision, then we minimize false positives.
Recall (sensitivity or true positive rate)
Recall is the ratio of correct positive predictions out of all positive cases. High recall means
that false negatives are minimized.
Specificity (true negative rate)
Specificity is the ratio of correct negative predictions out of all cases that are actually
negative.
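These four metrics can be computed directly from predicted and actual labels. A pure-Python sketch with hypothetical labels:

```python
# Confusion-matrix metrics from actual vs. predicted binary labels.
def confusion_metrics(actual, predicted):
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(actual),
        "precision": tp / (tp + fp),      # correct positives / predicted positives
        "recall": tp / (tp + fn),         # correct positives / actual positives
        "specificity": tn / (tn + fp),    # correct negatives / actual negatives
    }

# Hypothetical labels for 10 cases
actual    = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]
m = confusion_metrics(actual, predicted)
# accuracy 0.7, precision 0.5, recall 2/3, specificity 5/7
```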
ROC Curve
• Receiver Operating Characteristic (ROC) curves are graphical
representations of how well the model can tell classes apart at different
decision thresholds.
• The curve plots the true positive rate (sensitivity or recall) against the false
positive rate (1 – specificity) at various classification thresholds.
This gives a good overview of a model’s performance across
various thresholds, helping to understand the trade-offs between
TPR and FPR.
• We can also calculate the Area Under the ROC Curve (AUC) for a
single measure of the model’s overall performance. Higher AUC =
better model performance.
• The diagonal line represents random guessing; any curve above it
indicates better-than-random performance. The closer the curve is
to the top-left corner, the higher the model’s performance.
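AUC can be approximated from ROC points with the trapezoidal rule; the checks below cover the perfect-classifier and random-guessing cases (the points are illustrative):

```python
# Trapezoidal AUC from ROC points given as (FPR, TPR) pairs.
def auc(points):
    pts = sorted(points)  # order by false positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# A perfect classifier's ROC passes through (0, 1): AUC = 1.0
assert auc([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]) == 1.0
# The diagonal (random guessing) gives AUC = 0.5
assert auc([(0.0, 0.0), (1.0, 1.0)]) == 0.5
```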
Foundation Skills Academy
Thank You