Logistic Regression Insights

Classification problems involve predicting discrete class outcomes and are common in analytics. Logistic regression is widely used for classification problems to predict the probability of class membership based on explanatory variables. It is well-suited for binary and multi-class dependent variables and makes fewer assumptions than other techniques like linear regression. Examples of classification problems include customer profiling, credit risk assessment, and fraud detection.


CLASSIFICATION PROBLEMS

• Classification problems are an important category of problems in analytics in which the response variable (Y) takes a discrete value.

• The primary objective is to predict the class of a customer (or the class probability) based on the values of explanatory variables or predictors.
Classification Problems

• Examples of classification problems:
  - Customer profiling (customer segmentation)
  - Customer churn
  - Credit classification (low, medium and high risk)
  - Employee attrition
  - Fraud detection (classification of transactions as fraud/no-fraud)
  - Stress levels
  - Text classification (sentiment analysis)
Types of Classification Techniques

• Logistic Regression
• Discriminant Analysis
• Support Vector Machine
• Naïve Bayes
• Stochastic Gradient Descent
• Decision Tree
• Random Forest
Why do we need logistic regression when we have linear regression?

• Think about an important metric in marketing: customer retention.

• If Keepmoney Bank wants to use a regression analysis to examine whether it will retain a customer, it will set retention as its dependent variable.
Why do we need logistic regression when we have linear regression?

• Rather than being normally distributed in a bell curve in the manner of continuous variables, however, a 1 will be assigned to represent customer retention and a 0 will represent customer loss. Only those two outcomes are possible.

• This is a situation wherein what you are trying to predict is one of two options.
Why do we need logistic regression when we have linear regression?

• But why can’t Keepmoney use trusted linear regression to determine the likelihood of customer retention given a set of independent variables?

• Linear regressions assume a bell-curve distribution of outcomes (what is known as a normal distribution) from negative infinity to infinity. For a binary variable such as customer retention, there is no curve across a range of outcomes. The outcome can only be 1 or 0.
Why do we need logistic regression when we have linear regression?

• If Keepmoney attempts to use a linear regression to examine customer retention, nonsensical predictions may result. The bank may find its chances of customer retention are greater than 1, meaning it has better than a 100% chance of retaining a customer. Or the bank may find its chances are less than 0.
Customer Choice Behaviour

• Logistic regression is used to represent consumers’ choice behavior as accurately as possible.

• When individual consumers choose products, the value they place on the product does not typically increase linearly with increases in a preferred feature of the product.

• Instead, research indicates consumer valuation of a product typically follows an S-shaped curve with increases in the levels of a preferred attribute.
S-Shaped Curve
• Imagine that on the x-axis we have the level of discount on an INR 10000 plane ticket from Bangalore to New Delhi.

• Ask a group of your friends how many of them would purchase the flight. Then offer a discount of INR 500. How many additional people said they would buy the ticket? Probably not many.

• Increase the discount to INR 1000. Maybe one person half-heartedly jumps in.

• At an INR 3000-4000 discount, you are likely to see a spike in purchasers.

• After that, the number of additional purchasers will taper off, as you have reached the upper threshold.
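This tapering pattern is the logistic (S-shaped) function itself. A minimal sketch in Python that plots such a curve; the coefficients b0 and b1 are made up for illustration, since the slides fit no model to this example:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical coefficients chosen only for illustration:
# the slides do not fit a model to the discount example.
b0, b1 = -6.0, 0.002  # intercept and per-rupee discount effect

discount = np.linspace(0, 6000, 200)                 # discount in INR
p_buy = 1 / (1 + np.exp(-(b0 + b1 * discount)))      # logistic response

plt.plot(discount, p_buy)
plt.xlabel("Discount (INR)")
plt.ylabel("Probability of purchase")
plt.title("S-shaped (logistic) response to discount")
plt.show()
```

With these illustrative values the curve is nearly flat up to about INR 1000, rises steeply around INR 3000, and levels off thereafter, matching the story above.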
Logistic Regression: Regression
with a Binary Dependent Variable

Logistic Regression Defined
Logistic Regression . . . is a specialized form of regression that is designed to predict and explain a binary (two-group) categorical variable rather than a metric dependent measure.

• It is less affected than discriminant analysis when the basic assumptions, particularly normality of the independent variables, are not met.
Logistic regression is best suited to address
two research objectives . . .
• Identifying the independent variables that
impact group membership in the dependent
variable.
• Establishing a classification system based on
the logistic model for determining group
membership.

Why Logistic Regression, not Linear Regression
• The binary nature of the dependent variable (0 – 1)
means the error term has a binomial distribution
instead of a normal distribution, and it thus invalidates
all testing based on the assumption of normality.
• The variance of the dichotomous variable is not
constant, creating instances of heteroscedasticity as
well.
• Neither of the above violations can be remedied
through transformations of the dependent or
independent variables. Logistic regression was
developed to specifically deal with these issues.

Limited Assumptions in
Logistic Regression

• The advantages of logistic regression are primarily the result of the general lack of assumptions.
• Logistic regression does not require any specific
distributional form for the independent variables.
• Linear relationships between the dependent and
independent variables are not required.

Logistic Regression - Introduction

• The name logistic regression derives from the logistic distribution function:

$$\frac{e^Z}{1 + e^Z}$$

• Mathematically, logistic regression attempts to estimate the conditional probability of an event (or class probability).
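As a quick sketch, the function can be written in Python (numpy assumed available; the example values echo the rain example used later):

```python
import numpy as np

def logistic(z):
    """Logistic (sigmoid) function: maps any real z to a probability in (0, 1)."""
    return np.exp(z) / (1 + np.exp(z))

print(logistic(0))       # 0.5: even odds at z = 0
print(logistic(-1.386))  # ~0.2, matching the 20%-chance-of-rain example below
```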
Logistic Regression

• Logistic regression models estimate how the probability of an event may be affected by one or more explanatory variables.

• Logistic regression is a technique used for predicting “class probability”, that is, the probability that the case belongs to a particular class.
Binomial & Multinomial Logistic Regression

• Binomial (or binary) logistic regression is a model in which the dependent variable is dichotomous.

• In a multinomial logistic regression model, the dependent variable can take more than two values.

• The independent variables may be of any type.

Mathematics of Logistic Regression
Concept of Odds

• Probabilities are simply the likelihood that something will happen.

• A probability of rain of .2 means that there is a 20% chance of rain.
Mathematics of Logistic Regression
• Odds are the ratio of the probability that an event will occur divided by the probability that the event will not occur.

• If there is a 20% chance of rain, there is an 80% chance of no rain.

• Odds = Prob(rain)/Prob(no rain) = .2/.8 = .25
Mathematics of Logistic Regression
• Unlike probability, odds can take any non-negative value.

• An 80% chance of rain has odds of .8/.2 = 4.

• ODDS RATIO: the ratio of two odds.

Logit Function

• The logit function is defined as the natural logarithm of the odds.

• The logit of a probability $\pi$ (with value between 0 and 1) is given by:

$$\text{Logit}(\pi) = \ln\left(\frac{\pi}{1-\pi}\right) = \beta_0 + \beta_1 x_1$$

If there is a 20% chance of rain, the logit is ln(.25) = -1.386.
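A quick check of these conversions, using only the Python standard library:

```python
import math

p = 0.2                  # probability of rain
odds = p / (1 - p)       # 0.2 / 0.8 = 0.25
logit = math.log(odds)   # ln(0.25) = -1.386

print(odds, logit)       # 0.25 -1.3862943611198906
```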


Logistic Transformation

• The logistic regression model is given by:

$$\ln\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_0 + \beta_1 X_i$$

The left-hand side is a function with linear properties (the link function). Equivalently,

$$\frac{\pi_i}{1-\pi_i} = e^{\beta_0 + \beta_1 X_i}$$

$$\pi_i = \frac{e^{\beta_0 + \beta_1 X_i}}{1 + e^{\beta_0 + \beta_1 X_i}}$$
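As a sketch of how such a model is typically estimated in practice, assuming the statsmodels package is available; the data here are simulated for illustration and are not from the slides:

```python
import numpy as np
import statsmodels.api as sm

# Illustrative data: X is a single explanatory variable,
# y is a binary (0/1) outcome such as retention.
rng = np.random.default_rng(0)
X = rng.normal(size=200)
y = (rng.random(200) < 1 / (1 + np.exp(-(0.5 + 1.2 * X)))).astype(int)

# Fit pi_i = exp(b0 + b1*X_i) / (1 + exp(b0 + b1*X_i)) by maximum likelihood.
model = sm.Logit(y, sm.add_constant(X)).fit()
print(model.summary())                          # coefficients, z-stats, p-values
print(model.predict(sm.add_constant(X))[:5])    # estimated class probabilities
```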



Estimation of Logistic Regression Model
and Assessing Overall Fit

• Transforming the dependent variable
• Estimating the coefficients
• Transforming a probability into odds and logit values
• Model estimation
• Assessing the goodness of fit
Transforming a Probability into
Odds and Logit Values

o The logistic transformation has two basic steps:
  - Restating a probability as odds, and
  - Calculating the logit values.
o Instead of using ordinary least squares to estimate the model, the maximum likelihood method is used.
o The basic measure of how well the maximum likelihood estimation procedure fits is the likelihood value.
Logistic Regression Output

[Slide figure: logistic regression output table for the admission example; not reproduced here.]
Test for Significance of the Coefficients

- We use hypothesis testing to see whether a coefficient is significantly different from 0, i.e., whether it has any impact or not.
- Like the t-value in linear regression, here we use the Wald statistic.
- GRE and shopping do not impact the probability of getting admission.
Wald’s Test
Wald’s test is used for checking the statistical significance of individual predictor variables (equivalent to the t-test in the MLR model). The null and alternative hypotheses for Wald’s test are:

H0: $\beta_i = 0$
H1: $\beta_i \neq 0$

Wald’s test statistic is given by

$$W = \left(\frac{\hat{\beta}_i}{S_e(\hat{\beta}_i)}\right)^2$$
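A minimal sketch of the computation, assuming a coefficient estimate and its standard error are already in hand (both values below are made up for illustration):

```python
from scipy import stats

beta_hat = 0.828   # illustrative coefficient estimate
se_beta = 0.31     # illustrative standard error

W = (beta_hat / se_beta) ** 2      # Wald statistic: squared z-ratio
p_value = stats.chi2.sf(W, df=1)   # W ~ chi-square with 1 df under H0

print(W, p_value)
```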
Interpretation of the Coefficients

- This model develops coefficients for the independent variables, similar to linear regression.

- But the interpretations are different.

- Here the equation is:

$$\ln\left(\frac{P(Y=1)}{P(Y=0)}\right) = -4.087909 + 0.827991 \cdot \text{gpa} - 0.13602527 \cdot \text{tier3} - 1.500575 \cdot \text{tier4}$$

where tier3 = 1 if the applicant is from a tier-3 institute (else 0) and tier4 = 1 if from a tier-4 institute (else 0).

- The coefficients $\beta_0, \beta_1, \ldots$ are actually measures of the change in the ratio of the probabilities (the odds).
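Plugging the fitted values into the logistic function gives the class probability. A sketch; the coefficients are from the slide, while the example applicant is hypothetical:

```python
import math

def admission_probability(gpa, tier3, tier4):
    """Class probability from the fitted admission model on this slide."""
    log_odds = -4.087909 + 0.827991 * gpa - 0.13602527 * tier3 - 1.500575 * tier4
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical applicant: GPA of 4.0 from a tier-3 institute.
print(admission_probability(gpa=4.0, tier3=1, tier4=0))  # ~0.29
```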
Directionality of the Relationship
A positive relationship means an increase in the independent variable is associated with an increase in the predicted probability, and vice versa. But the direction of the relationship is reflected differently for the original and exponentiated logistic coefficients.
• Original coefficient signs indicate the direction of the relationship.
• Exponentiated coefficients are interpreted differently since they are the antilogs (exponentials) of the original coefficients and cannot be negative. Thus, exponentiated coefficients above 1.0 represent a positive relationship and values less than 1.0 represent negative relationships.
Magnitude of the Relationship . . .

The magnitude of metric independent variables is interpreted differently for original and exponentiated logistic coefficients:
• Original logistic coefficients are less useful in determining the magnitude of the relationship since they reflect the change in the logit (logged odds) value.
• For every one-unit increase in brand attitude score, we expect a 1.274 increase in the log odds of brand loyalty in the positive direction.
Magnitude of the Relationship . . .
Exponentiated coefficients directly reflect the magnitude of the change in the odds value.
- An exponentiated coefficient of 1.0 denotes no change (1.0 times the old odds = no change).
- The exponentiated coefficient minus 1.0 equals the percentage change in the odds.
- An exponentiated coefficient of .2 indicates a negative 80 percent change in the odds (.20 - 1) for each unit change in the independent variable.
Magnitude of the Relationship . . .

Percentage change in odds = (Exponentiated coefficient - 1) × 100

Exponentiated coeff (e^b):          .2     1.0     1.7
Exponentiated coeff (e^b) - 1:     -.8     0.0      .7
Percent change in odds:           -80%      0%     70%
Calculating the New Odds . . .

New odds value = old odds value × exponentiated coefficient × change in independent variable

• At present the odds are 1, the exponentiated coefficient is 2.35, and the independent variable changes from 5.5 to 7. What would be the new odds?
Calculating the New Odds . . .

• New odds = 1 × 2.35 × (7 - 5.5) = 3.525
• Probability = odds / (1 + odds)
• Odds of 3.525 indicate a probability of 0.779, i.e. 3.525/(1 + 3.525).
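The same arithmetic as a quick check in Python:

```python
old_odds = 1.0
exp_coeff = 2.35
delta_x = 7 - 5.5                            # change in the independent variable

new_odds = old_odds * exp_coeff * delta_x    # 3.525, per the slide's formula
probability = new_odds / (1 + new_odds)      # 0.779

print(new_odds, probability)
```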
Assessing Mobile App Purchasers . . .
Output of Logistic Regression . . .
Model Parameters . . .

[Slide figures: logistic regression output and model parameters for the mobile app purchaser example; not reproduced here.]
Problem . . .

When the customer review average is 3, the odds of a game being a best seller are 3.310. Please answer the two questions below.

• What is the probability of the game being a best seller when the customer review average is 3?

• If the customer review average increases from 3 to 4, what is the increase in the probability of the game being a best seller?
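A sketch of a solution. The first part follows directly from probability = odds/(1 + odds); the second part needs the model's exponentiated coefficient for customer review average, which appears in the output slides not reproduced here, so a placeholder value is used:

```python
odds_at_3 = 3.310
p_at_3 = odds_at_3 / (1 + odds_at_3)   # probability = odds / (1 + odds) = 0.768

# Part 2: new_odds = old_odds * exp_coeff for a one-unit increase.
exp_coeff = 2.0                        # PLACEHOLDER: read off the model output
odds_at_4 = odds_at_3 * exp_coeff
p_at_4 = odds_at_4 / (1 + odds_at_4)

print(p_at_3, p_at_4 - p_at_3)
```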
Classification Accuracy

- This shows how well the group memberships are predicted.

- It develops a hit ratio, which is the percentage of cases correctly classified. This is known as CA (Classification Accuracy).
Accuracy Paradox
• Assume an example of insurance fraud. Past data has revealed that out of 1000 claims in the past, 950 are true claims and 50 are fraudulent claims.

• The classification table using a logistic regression model is given below:

Observed    Predicted 0    Predicted 1    % accuracy
0               900             50          94.73%
1                 5             45          90.00%

• The overall accuracy is 94.5%. Classifying all of them as true claims would give 95% accuracy!
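The paradox can be verified from the table (fraud is treated as the positive class):

```python
# Confusion matrix from the slide: rows = observed, columns = predicted.
tn, fp = 900, 50   # observed 0 (true claims)
fn, tp = 5, 45     # observed 1 (fraudulent claims)

total = tn + fp + fn + tp
model_accuracy = (tn + tp) / total   # 0.945
naive_accuracy = (tn + fp) / total   # 0.95: always predict "true claim"

print(model_accuracy, naive_accuracy)
```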
Sensitivity, Specificity and Precision
• The ability of the model to correctly classify positives and negatives is called sensitivity and specificity, respectively.
• The terminologies sensitivity and specificity originated in medical diagnostics.
• In the generic case:

Sensitivity = P(model classifies Yi as positive | Yi is positive)

Sensitivity is calculated using the following equation:

$$\text{Sensitivity} = \frac{\text{True Positive (TP)}}{\text{True Positive (TP)} + \text{False Negative (FN)}}$$

where True Positive (TP) is the number of positives correctly classified as positives by the model and False Negative (FN) is the number of positives incorrectly classified as negatives.
Specificity
• Specificity is the ability of the diagnostic test to correctly classify the test as negative when the disease is not present. That is:

Specificity = P(diagnostic test is negative | patient has no disease)

• In general:

Specificity = P(model classifies Yi as negative | Yi is negative)

Specificity can be calculated using the following equation:

$$\text{Specificity} = \frac{\text{True Negative (TN)}}{\text{True Negative (TN)} + \text{False Positive (FP)}}$$

where True Negative (TN) is the number of negatives correctly classified as negatives by the model and False Positive (FP) is the number of negatives incorrectly classified as positives.
• The decision maker has to consider the tradeoff between sensitivity and specificity to arrive at an optimal cut-off probability.
• Precision measures the accuracy of the positives classified by the model:

Precision = P(patient has disease | diagnostic test is positive)

$$\text{Precision} = \frac{\text{True Positive (TP)}}{\text{True Positive (TP)} + \text{False Positive (FP)}}$$

• F-Score (F-Measure) is another measure used in binary logistic regression that combines both precision and recall (recall is another name for sensitivity) and is given by:

$$F\text{-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
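Applied to the insurance-fraud table from the accuracy paradox slide (fraud = positive class), these formulas give:

```python
tp, fn = 45, 5     # fraudulent claims: correctly / incorrectly classified
tn, fp = 900, 50   # true claims: correctly / incorrectly classified

sensitivity = tp / (tp + fn)   # recall = 0.90
specificity = tn / (tn + fp)   # 0.947
precision = tp / (tp + fp)     # 0.474
f_score = 2 * precision * sensitivity / (precision + sensitivity)  # 0.621

print(sensitivity, specificity, precision, f_score)
```

Note how a model with 94.5% overall accuracy still has a precision below 0.5 on the rare positive class.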
Concordant and Discordant Pairs

• Discordant Pairs: A pair of positive and negative observations for which the model has no cut-off probability to classify both of them correctly is called a discordant pair.

• Concordant Pairs: A pair of positive and negative observations for which the model has a cut-off probability to classify both of them correctly is called a concordant pair.
Receiver Operating Characteristics (ROC)
Curve
• The ROC curve is a plot of sensitivity (true positive rate) on the vertical axis against 1 - specificity (false positive rate) on the horizontal axis.
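A minimal sketch of drawing the ROC curve and computing AUC with scikit-learn; the labels and probabilities below are illustrative, not from the slides:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

# y_true: observed 0/1 labels; y_prob: model's estimated class probabilities.
y_true = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]                      # illustrative
y_prob = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

fpr, tpr, thresholds = roc_curve(y_true, y_prob)   # sweep over cut-offs
auc = roc_auc_score(y_true, y_prob)

plt.plot(fpr, tpr, label=f"AUC = {auc:.3f}")
plt.plot([0, 1], [0, 1], linestyle="--")           # chance line
plt.xlabel("1 - Specificity (false positive rate)")
plt.ylabel("Sensitivity (true positive rate)")
plt.legend()
plt.show()
```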
Area Under ROC Curve (AUC), Lorenz Curve and Gini Coefficient

$$\text{Gini coefficient} = \frac{\text{Area A}}{\text{Area A} + \text{Area B}}$$

[Slide figure: Lorenz curve with both axes from 0 to 1. Area A lies between the ideal wealth distribution (the diagonal) and the actual wealth distribution curve; Area B lies below the actual wealth distribution curve.]
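For a classifier, the Gini coefficient is related to AUC by Gini = 2 × AUC - 1, a standard identity (not stated on the slides). Continuing the illustrative values from the ROC sketch above:

```python
from sklearn.metrics import roc_auc_score

# Same illustrative labels and probabilities as the ROC sketch.
y_true = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

auc = roc_auc_score(y_true, y_prob)
gini = 2 * auc - 1        # Gini coefficient of the classifier
print(auc, gini)
```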
