Chapter 3
Logistic Regression
MSc. Nguyen Khanh Loi
[email protected]
8/2023
MSc Nguyen Khanh Loi
Content
- Classification
- Hypothesis Representation
- Decision boundary
- Cost function
- Simplified cost function and gradient descent
- Advanced optimization
- Multi-class classification: One-vs-all
Classification
Email: Spam / Not Spam?
Online Transactions: Fraudulent (Yes / No)?
Tumor: Malignant / Benign ?
0: “Negative Class” (e.g., benign tumor)
1: “Positive Class” (e.g., malignant tumor)
y can take only one of two values, $y \in \{0, 1\}$: binary classification
[Figure: Malignant? (1 = Yes, 0 = No) plotted against Tumor Size]
Threshold classifier output $h_\theta(x)$ at 0.5:
If $h_\theta(x) \ge 0.5$, predict “y = 1”
If $h_\theta(x) < 0.5$, predict “y = 0”
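The thresholding rule can be sketched in a few lines of Python. This is only an illustration: the linear hypothesis, the parameter values, and the tumor sizes below are hypothetical and are not taken from the slides.

```python
def linear_hypothesis(theta0, theta1, x):
    """Linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def threshold_predict(h_value, threshold=0.5):
    """Predict 1 when the hypothesis output reaches the threshold, else 0."""
    return 1 if h_value >= threshold else 0


# Hypothetical parameters and tumor sizes, chosen only for illustration.
theta0, theta1 = -0.5, 0.25
tumor_sizes = [1.0, 3.0, 5.0, 8.0]

outputs = [linear_hypothesis(theta0, theta1, s) for s in tumor_sizes]
predictions = [threshold_predict(h) for h in outputs]
# outputs is [-0.25, 0.25, 0.75, 1.5], so predictions is [0, 0, 1, 1]
```

Note that the raw outputs of this linear hypothesis fall below 0 and above 1 for some inputs, even though the labels are only 0 or 1.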
Classification: y = 0 or 1
Linear regression: $h_\theta(x)$ can be > 1 or < 0
Logistic regression: $0 \le h_\theta(x) \le 1$
Hypothesis Representation
Logistic Regression Model
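Want $0 \le h_\theta(x) \le 1$
$h_\theta(x) = g(\theta^T x)$, where $g(z) = \dfrac{1}{1 + e^{-z}}$
$g$ is called the sigmoid function, or logistic function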
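A minimal Python sketch of the sigmoid and of the hypothesis $h_\theta(x) = g(\theta^T x)$ follows. The function names are illustrative, and the two-branch form of the sigmoid is a common way to avoid overflow for large negative z, not something the slides prescribe.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # safe: z < 0, so exp(z) is at most 1
    return ez / (1.0 + ez)


def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); x is assumed to include x0 = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)
```

For example, `sigmoid(0)` is 0.5, and `sigmoid(z)` approaches 1 for large positive z and 0 for large negative z, so the output always lies between 0 and 1.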
Interpretation of Hypothesis Output
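$h_\theta(x)$ = estimated probability that y = 1 on input x
Example: If $h_\theta(x) = 0.7$,
tell the patient that there is a 70% chance of the tumor being malignant
$h_\theta(x) = P(y = 1 \mid x; \theta)$: “probability that y = 1, given x, parameterized by $\theta$”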
Decision boundary
Logistic regression
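$h_\theta(x) = g(\theta^T x)$, $g(z) = \dfrac{1}{1 + e^{-z}}$
Suppose we predict “y = 1” if $h_\theta(x) \ge 0.5$
and predict “y = 0” if $h_\theta(x) < 0.5$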
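When is $h_\theta(x) \ge 0.5$?
$g(z) \ge 0.5$ when $z \ge 0$
So $h_\theta(x) = g(\theta^T x) \ge 0.5$ (predict “y = 1”) when $\theta^T x \ge 0$, and $h_\theta(x) < 0.5$ (predict “y = 0”) when $\theta^T x < 0$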
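2 features: $x_1$, $x_2$
$h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2)$
Example: $\theta_0 = -3$, $\theta_1 = 1$, $\theta_2 = 1$
Decision boundary: $-3 + x_1 + x_2 = 0$
Predict “y = 1” if $-3 + x_1 + x_2 \ge 0$, i.e. $x_1 + x_2 \ge 3$
[Figure: the line $x_1 + x_2 = 3$ in the ($x_1$, $x_2$) plane, both axes marked 1 to 3]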
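The example can be checked with a short Python sketch. The parameter values are the ones from the slide; the test points are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Parameters from the example: theta0 = -3, theta1 = 1, theta2 = 1.
theta = [-3.0, 1.0, 1.0]


def predict(x1, x2):
    """Predict 1 when h_theta(x) >= 0.5, which is the same as theta^T x >= 0."""
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical points below, on, and above the line x1 + x2 = 3.
below = predict(1.0, 1.0)      # z = -1, so this is 0
on_line = predict(1.5, 1.5)    # z = 0, h = 0.5, so this is 1
above = predict(2.0, 2.0)      # z = 1, so this is 1
```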
Non-linear decision boundaries
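$h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2)$
Choose: $\theta_0 = -1$, $\theta_1 = 0$, $\theta_2 = 0$, $\theta_3 = 1$, $\theta_4 = 1$
$h_\theta(x) = g(-1 + x_1^2 + x_2^2)$
Decision boundary: $z = -1 + x_1^2 + x_2^2 = 0$
Predict “y = 1” if $x_1^2 + x_2^2 \ge 1$
[Figure: the circle $x_1^2 + x_2^2 = 1$ in the ($x_1$, $x_2$) plane, both axes marked from -1 to 1]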
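A small Python sketch of this circular boundary, using the polynomial features $x_1^2$ and $x_2^2$ and the parameter values chosen above. The helper names and the test points are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Parameters from the example: theta0..theta4 = -1, 0, 0, 1, 1.
theta = [-1.0, 0.0, 0.0, 1.0, 1.0]


def features(x1, x2):
    """Map (x1, x2) to [1, x1, x2, x1^2, x2^2]."""
    return [1.0, x1, x2, x1 * x1, x2 * x2]


def predict(x1, x2):
    """Predict 1 on or outside the unit circle, 0 inside it."""
    z = sum(t * f for t, f in zip(theta, features(x1, x2)))
    return 1 if sigmoid(z) >= 0.5 else 0


inside = predict(0.5, 0.5)    # z = -0.5, so this is 0
boundary = predict(1.0, 0.0)  # z = 0, so this is 1
outside = predict(1.0, 1.0)   # z = 1, so this is 1
```

The model is still linear in the parameters; the boundary is curved only because the features themselves are non-linear in $x_1$ and $x_2$.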
[Figure: two plots in the ($x_1$, $x_2$) plane illustrating non-linear decision boundaries]
Cost function
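Training set of $m$ examples: $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})\}$
$x = [x_0, x_1, \dots, x_n]^T$ with $x_0 = 1$, and $y \in \{0, 1\}$
$h_\theta(x) = \dfrac{1}{1 + e^{-\theta^T x}}$
How to choose parameters $\theta$?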
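Linear regression: $J(\theta) = \dfrac{1}{m} \sum_{i=1}^{m} \dfrac{1}{2} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
With the linear hypothesis $h_\theta(x) = \theta^T x$, this squared-error cost is “convex” in $\theta$.
With the logistic hypothesis $h_\theta(x) = \dfrac{1}{1 + e^{-\theta^T x}}$, the same cost is “non-convex”, so gradient descent is not guaranteed to reach the global minimum.
[Figure: $J(\theta)$ for linear regression (“convex”) and for logistic regression (“non-convex”)]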
Logistic regression cost function
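If y = 1: $\mathrm{Cost}(h_\theta(x), y) = -\log(h_\theta(x))$
The cost is 0 when $h_\theta(x) = 1$ and grows without bound as $h_\theta(x) \to 0$.
[Figure: cost plotted against $h_\theta(x)$ on the interval from 0 to 1]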
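If y = 0: $\mathrm{Cost}(h_\theta(x), y) = -\log(1 - h_\theta(x))$
The cost is 0 when $h_\theta(x) = 0$ and grows without bound as $h_\theta(x) \to 1$.
[Figure: cost plotted against $h_\theta(x)$ on the interval from 0 to 1]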
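The two cases of the per-example cost can be written as one small Python function. This is a sketch: the function name is illustrative, and it assumes $0 < h_\theta(x) < 1$ so that both logarithms are defined.

```python
import math


def example_cost(h, y):
    """Cost(h, y) = -log(h) if y == 1, else -log(1 - h); assumes 0 < h < 1."""
    if y == 1:
        return -math.log(h)
    return -math.log(1.0 - h)


# A confident, correct prediction is cheap; a confident, wrong one is expensive.
cheap = example_cost(0.99, 1)
expensive = example_cost(0.01, 1)
```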
Simplified cost function and gradient descent
Logistic regression cost function
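$J(\theta) = \dfrac{1}{m} \sum_{i=1}^{m} \mathrm{Cost}(h_\theta(x^{(i)}), y^{(i)}) = -\dfrac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$
This cost function can be derived from the principle of maximum likelihood estimation.
To fit parameters $\theta$: $\min_\theta J(\theta)$
To make a prediction given new $x$:
Output $h_\theta(x) = \dfrac{1}{1 + e^{-\theta^T x}}$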
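As a rough sketch, the simplified cost $J(\theta)$ (the average of $-[y \log h + (1 - y) \log(1 - h)]$ over the training set) can be computed as follows. The data set and the second parameter vector are hypothetical, chosen only to show that a better fit gives a lower cost.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost(theta, X, y):
    """Average cross-entropy cost J(theta); each row of X includes x0 = 1."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        total += -(yi * math.log(h) + (1 - yi) * math.log(1.0 - h))
    return total / len(y)


# Hypothetical one-feature data set (x0 = 1 is prepended to every example).
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]

cost_at_zero = cost([0.0, 0.0], X, y)    # every h is 0.5, so J = log(2)
cost_better = cost([-4.5, 2.0], X, y)    # hypothetical parameters that fit the data
```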
Gradient Descent
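Want $\min_\theta J(\theta)$:
Repeat {
  $\theta_j := \theta_j - \alpha \dfrac{\partial}{\partial \theta_j} J(\theta)$
}
(simultaneously update all $\theta_j$)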
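Want $\min_\theta J(\theta)$. With the partial derivative written out:
Repeat {
  $\theta_j := \theta_j - \alpha \dfrac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
}
(simultaneously update all $\theta_j$)
The algorithm looks identical to linear regression! The only difference is the hypothesis: $h_\theta(x) = \dfrac{1}{1 + e^{-\theta^T x}}$ instead of $\theta^T x$.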
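A minimal batch gradient descent sketch for logistic regression, in plain Python. The learning rate, iteration count, zero initialization, and data set are all assumptions made for illustration; the slides do not specify them.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def gradient_descent(X, y, alpha=0.1, iterations=5000):
    """Minimize J(theta) by batch gradient descent; rows of X include x0 = 1."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        # Errors h_theta(x_i) - y_i, all computed from the current theta.
        errors = [sigmoid(sum(t * v for t, v in zip(theta, xi))) - yi
                  for xi, yi in zip(X, y)]
        grad = [sum(e * xi[j] for e, xi in zip(errors, X)) / m for j in range(n)]
        # Build a new list so that every theta_j is updated simultaneously.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical, non-separable one-feature data set.
sizes = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]
X = [[1.0, s] for s in sizes]

theta = gradient_descent(X, y)
```

Computing all errors before touching `theta` is what makes the update simultaneous: no component is updated using an already-updated neighbour.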
Advanced optimization
Optimization algorithm
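Cost function $J(\theta)$. Want $\min_\theta J(\theta)$.
Given $\theta$, we have code that can compute
- $J(\theta)$
- $\dfrac{\partial}{\partial \theta_j} J(\theta)$ (for $j = 0, 1, \dots, n$)
Gradient descent:
Repeat {
  $\theta_j := \theta_j - \alpha \dfrac{\partial}{\partial \theta_j} J(\theta)$
}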
Optimization algorithms:
- Gradient descent
- Conjugate gradient
- BFGS
- L-BFGS
Advantages (of conjugate gradient, BFGS, and L-BFGS over gradient descent):
- No need to manually pick the learning rate $\alpha$
- Often faster than gradient descent
Disadvantages:
- More complex
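All of these algorithms need the same two ingredients: code that returns $J(\theta)$ and code that returns its partial derivatives. A sketch of such a function in plain Python is shown below; the name `cost_and_gradient` and the data set are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost_and_gradient(theta, X, y):
    """Return (J(theta), [dJ/dtheta_0, ..., dJ/dtheta_n]) for logistic regression."""
    m, n = len(X), len(theta)
    j_val = 0.0
    grad = [0.0] * n
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        j_val += -(yi * math.log(h) + (1 - yi) * math.log(1.0 - h))
        for j in range(n):
            grad[j] += (h - yi) * xi[j]
    return j_val / m, [g / m for g in grad]


# Hypothetical data set; each row starts with x0 = 1.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 1, 0, 1]

j_val, grad = cost_and_gradient([0.3, -0.2], X, y)
```

A function of this shape is what general-purpose optimizers usually expect. For instance, SciPy's `scipy.optimize.minimize` accepts a function that returns the cost and gradient together when called with `jac=True`, and offers `method="CG"`, `"BFGS"`, and `"L-BFGS-B"`; using it here would be one option, not something the slides prescribe.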
Multi-class classification: One-vs-all
Multiclass classification
Email foldering/tagging: Work, Friends, Family, Hobby
Medical diagnosis: Not ill, Cold, Flu
Weather: Sunny, Cloudy, Rain, Snow
[Figure: two scatter plots in the ($x_1$, $x_2$) plane, titled “Binary classification” and “Multi-class classification”]
One-vs-all (one-vs-rest):
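[Figure: a three-class data set in the ($x_1$, $x_2$) plane, split into three binary problems, one each for Class 1, Class 2, and Class 3]
$h_\theta^{(i)}(x) = P(y = i \mid x; \theta)$ for $i = 1, 2, 3$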
One-vs-all
Train a logistic regression classifier $h_\theta^{(i)}(x)$ for each
class $i$ to predict the probability that $y = i$.
On a new input $x$, to make a prediction, pick the
class $i$ that maximizes $h_\theta^{(i)}(x)$.
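The procedure can be sketched in plain Python as follows. The training method (batch gradient descent with a fixed learning rate and iteration count), the function names, and the three-cluster data set are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta, features):
    """h_theta(x) for a feature list that already includes x0 = 1."""
    return sigmoid(sum(t * v for t, v in zip(theta, features)))


def train_binary(X, y, alpha=0.1, iterations=5000):
    """Fit one logistic regression classifier by batch gradient descent."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        errors = [hypothesis(theta, xi) - yi for xi, yi in zip(X, y)]
        grad = [sum(e * xi[j] for e, xi in zip(errors, X)) / m for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


def train_one_vs_all(X, labels):
    """Train one classifier per class: class c is relabeled 1, all others 0."""
    return {c: train_binary(X, [1 if label == c else 0 for label in labels])
            for c in sorted(set(labels))}


def predict_one_vs_all(classifiers, features):
    """Pick the class whose classifier reports the highest probability."""
    return max(classifiers, key=lambda c: hypothesis(classifiers[c], features))


# Hypothetical training set: three well-separated clusters in the (x1, x2) plane.
points = [(1.0, 1.0), (1.5, 1.0), (1.0, 1.5),
          (5.0, 1.0), (5.5, 1.5), (5.0, 1.5),
          (3.0, 5.0), (3.5, 5.5), (2.5, 5.0)]
labels = [1, 1, 1, 2, 2, 2, 3, 3, 3]
X = [[1.0, x1, x2] for x1, x2 in points]

classifiers = train_one_vs_all(X, labels)
predicted = predict_one_vs_all(classifiers, [1.0, 1.2, 1.2])  # a point near class 1
```

With K classes this trains K separate binary classifiers, so the training cost grows linearly with the number of classes.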
Practice 1
As an example of simple logistic regression, Suzuki et al. (2006) measured
sand grain size on 28 beaches in Japan and observed the presence or
absence of the burrowing wolf spider Lycosa ishikariana on each beach.
Grain size (mm)  Spiders    Grain size (mm)  Spiders
0.245            absent     0.432            absent
0.247            absent     0.473            present
0.285            present    0.509            present
0.299            present    0.529            present
0.327            present    0.561            absent
0.347            present    0.569            absent
0.356            absent     0.594            present
0.360            present    0.638            present
0.363            absent     0.656            present
0.364            present    0.816            present
0.398            absent     0.853            present
0.400            present    0.938            present
0.409            absent     1.036            present
0.421            present    1.045            present
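One possible way to approach this practice is sketched below: encode present as 1 and absent as 0, then fit $\theta_0$ and $\theta_1$ by batch gradient descent. The learning rate, iteration count, and function names are assumptions; any of the optimization methods from this chapter could be used instead.

```python
import math

# (grain size in mm, 1 = spiders present, 0 = spiders absent), from the table above.
beaches = [
    (0.245, 0), (0.247, 0), (0.285, 1), (0.299, 1), (0.327, 1), (0.347, 1), (0.356, 0),
    (0.360, 1), (0.363, 0), (0.364, 1), (0.398, 0), (0.400, 1), (0.409, 0), (0.421, 1),
    (0.432, 0), (0.473, 1), (0.509, 1), (0.529, 1), (0.561, 0), (0.569, 0), (0.594, 1),
    (0.638, 1), (0.656, 1), (0.816, 1), (0.853, 1), (0.938, 1), (1.036, 1), (1.045, 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost(theta):
    """Average cross-entropy cost J(theta) on the beach data."""
    total = 0.0
    for size, present in beaches:
        h = sigmoid(theta[0] + theta[1] * size)
        total += -(present * math.log(h) + (1 - present) * math.log(1.0 - h))
    return total / len(beaches)


def fit(alpha=1.0, iterations=10000):
    """Batch gradient descent on J(theta), starting from theta = [0, 0]."""
    theta = [0.0, 0.0]
    m = len(beaches)
    for _ in range(iterations):
        errors = [sigmoid(theta[0] + theta[1] * s) - p for s, p in beaches]
        grad0 = sum(errors) / m
        grad1 = sum(e * s for e, (s, _) in zip(errors, beaches)) / m
        theta = [theta[0] - alpha * grad0, theta[1] - alpha * grad1]
    return theta


theta = fit()


def probability_present(grain_size):
    """Estimated probability that spiders are present at this grain size."""
    return sigmoid(theta[0] + theta[1] * grain_size)
```

After fitting, `probability_present(g)` gives the estimated probability for a beach with grain size `g`, and the decision boundary is the grain size at which $\theta_0 + \theta_1 g = 0$.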