ML 03 Logistic Regression
Machine Learning
Andrew Ng
[Figure: tumor classification training data; Malignant? (1 = Yes, 0 = No) plotted against Tumor Size]
What we need: for classification $y \in \{0, 1\}$, so we want a hypothesis whose output satisfies $0 \le h_\theta(x) \le 1$ (a linear hypothesis can output values above 1 or below 0).
Logistic Regression: a model with $0 \le h_\theta(x) \le 1$ by construction.
Logistic Regression Model
Want $0 \le h_\theta(x) \le 1$. Use
$h_\theta(x) = g(\theta^T x)$, where $g(z) = \dfrac{1}{1 + e^{-z}}$
$g$ is the sigmoid function (also called the logistic function); it maps any real $z$ into $(0, 1)$.
A useful property: its derivative is easy to compute at any point, via the identity $g'(z) = g(z)\,(1 - g(z))$.
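As a minimal sketch (NumPy, with illustrative names), the sigmoid and its derivative identity could look like this:

import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid, using the identity g'(z) = g(z)(1 - g(z))."""
    g = sigmoid(z)
    return g * (1.0 - g)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))       # outputs lie in (0, 1); sigmoid(0) == 0.5
print(sigmoid_grad(z))  # largest at z = 0, shrinking toward the tails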
Interpretation of Hypothesis Output
$h_\theta(x)$ = estimated probability that $y = 1$ on input $x$, i.e. $h_\theta(x) = P(y = 1 \mid x; \theta)$.
Example: if $h_\theta(x) = 0.7$ for a patient's feature vector $x$ (with $x_1$ = tumor size), tell the patient there is a 70% chance of the tumor being malignant.
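A tiny illustration of reading $h_\theta(x)$ as a probability; the parameter and feature values here are hypothetical, chosen only to produce an output near 0.7:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-1.0, 0.05])    # hypothetical parameters [theta_0, theta_1]
x = np.array([1.0, 37.0])         # x_0 = 1 (intercept term), x_1 = tumor size
h = sigmoid(theta @ x)            # estimated P(y = 1 | x; theta)
print(f"P(malignant) = {h:.2f}")  # ~0.70 -> report a 70% chance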
Logistic regression
Suppose we predict "$y = 1$" if $h_\theta(x) \ge 0.5$, which happens exactly when $\theta^T x \ge 0$;
and predict "$y = 0$" if $h_\theta(x) < 0.5$, i.e. when $\theta^T x < 0$.
Separating two classes of points
• We are attempting to separate two given sets/classes of points,
• i.e. to separate two regions of the feature space.
• This is the concept of a Decision Boundary.
• Finding a good decision boundary means learning appropriate values for the parameters $\Theta$.
Decision Boundary
[Figure: two classes of training points in the $(x_1, x_2)$ plane, with axis ticks at 1, 2, 3, separated by the line $x_1 + x_2 = 3$]
Example: $h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2)$ with $\theta_0 = -3$, $\theta_1 = 1$, $\theta_2 = 1$.
Predict "$y = 1$" if $-3 + x_1 + x_2 \ge 0$, i.e. $x_1 + x_2 \ge 3$; the decision boundary is the line $x_1 + x_2 = 3$.
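A short sketch of this prediction rule, assuming the example parameters $\theta = [-3, 1, 1]^T$ reconstructed above:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-3.0, 1.0, 1.0])  # theta_0, theta_1, theta_2 from the example

def predict(x1, x2):
    """Predict y = 1 exactly when theta^T x >= 0, i.e. x1 + x2 >= 3."""
    x = np.array([1.0, x1, x2])     # prepend the intercept feature x_0 = 1
    return int(sigmoid(theta @ x) >= 0.5)

print(predict(1.0, 1.0))  # 0: below the line x1 + x2 = 3
print(predict(2.0, 2.0))  # 1: on or above the line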
Non-linear decision boundaries
[Figure: two classes of points in the $(x_1, x_2)$ plane, with axis ticks at $-1$ and $1$, separated by a circle]
We can learn more complex decision boundaries when the hypothesis function contains higher-order terms.
Example: $h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2)$ with $\theta = [-1, 0, 0, 1, 1]^T$.
Predict "$y = 1$" if $-1 + x_1^2 + x_2^2 \ge 0$, i.e. $x_1^2 + x_2^2 \ge 1$; the decision boundary is the unit circle $x_1^2 + x_2^2 = 1$.
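The same sketch adapted to the quadratic features of this example, again assuming the reconstructed parameters:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-1.0, 0.0, 0.0, 1.0, 1.0])  # example parameters from the slide

def predict(x1, x2):
    """Features [1, x1, x2, x1^2, x2^2]; the boundary is the circle x1^2 + x2^2 = 1."""
    features = np.array([1.0, x1, x2, x1**2, x2**2])
    return int(sigmoid(theta @ features) >= 0.5)

print(predict(0.2, 0.2))  # 0: inside the unit circle
print(predict(1.0, 1.0))  # 1: outside the unit circle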
Cost function for Logistic Regression
How do we get the parameter values?
Training set: $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$ of $m$ examples, with $y \in \{0, 1\}$.
Linear regression minimizes the squared-error cost $J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$; with the sigmoid hypothesis this cost is non-convex in $\theta$, so we need a different cost function that gradient descent can reliably minimize.
Logistic regression cost function

$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$

Intuition: if $y = 1$ and $h_\theta(x) \to 1$, the cost goes to 0; but if $y = 1$ while $h_\theta(x) \to 0$, the cost goes to $\infty$, heavily penalizing a confident wrong prediction. The symmetric statements hold for $y = 0$.

Since $y$ is always 0 or 1, the two cases can be written as a single expression:

$\mathrm{Cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$

Summing over the training set gives the overall cost:

$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$

This $J(\theta)$ is convex, so gradient descent converges to the global minimum.
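A minimal NumPy sketch of this cost function (the epsilon guard against log(0) is an implementation detail, not part of the slide's formula):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Logistic regression cost J(theta).
    X: (m, n+1) design matrix with a leading column of ones; y: (m,) labels in {0, 1}."""
    m = len(y)
    h = sigmoid(X @ theta)
    eps = 1e-12  # guards against log(0) in floating point
    return -(np.sum(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))) / m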
Gradient Descent

Want $\min_\theta J(\theta)$.

Repeat {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$
} (simultaneously update all $\theta_j$)

The partial derivative works out to $\frac{\partial}{\partial \theta_j} J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$, so the update becomes

Repeat {
  $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
}

The update looks identical to linear regression's, but $h_\theta(x)$ is now the sigmoid of $\theta^T x$ rather than $\theta^T x$ itself.
Output: for a new input $x$, output $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$.
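A minimal NumPy sketch of the batch update above; the toy data and hyperparameters are illustrative:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Batch gradient descent for logistic regression.
    Applies theta_j := theta_j - alpha * (1/m) * sum((h - y) * x_j)
    to all components of theta simultaneously, in vectorized form."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)      # predictions for all m examples
        grad = (X.T @ (h - y)) / m  # all partial derivatives at once
        theta -= alpha * grad
    return theta

# toy usage: y = 1 when x1 is large (illustrative data, leading column is x_0 = 1)
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, y)
print(sigmoid(X @ theta))  # probabilities increase with x1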
How to use the estimated probability?
• Refraining from classifying unless confident (see the sketch after this list)
• Ranking items
• Multi-class classification
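For the first bullet, a sketch of abstaining when the estimated probability is not confident enough; the threshold tau is a hypothetical choice, not from the slides:

def classify_with_reject(p, tau=0.9):
    """Return 1 or 0 only when the estimated probability p = h_theta(x)
    is confident enough in either direction; otherwise abstain."""
    if p >= tau:
        return 1
    if p <= 1.0 - tau:
        return 0
    return None  # abstain: not confident enough either way

print(classify_with_reject(0.95))  # 1
print(classify_with_reject(0.60))  # None (abstain)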
Multi-class classification: one vs. all
Example: news article tagging with classes Politics, Sports, Movies, Religion, …
[Figure: binary classification (two classes in the $(x_1, x_2)$ plane) vs. multi-class classification (three classes in the $(x_1, x_2)$ plane)]
One-vs-all (one-vs-rest):
[Figure: the three-class data set is split into three binary problems, one per class, each in the $(x_1, x_2)$ plane]
Class 1: $h_\theta^{(1)}(x)$
Class 2: $h_\theta^{(2)}(x)$
Class 3: $h_\theta^{(3)}(x)$
One-vs-all
Train a logistic regression classifier $h_\theta^{(i)}(x)$ for each class $i$ to predict the probability that $y = i$.
On a new input $x$, pick the class $i$ that maximizes $h_\theta^{(i)}(x)$.
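A minimal sketch of one-vs-all training and prediction, reusing the gradient-descent update from earlier; names, toy data, and hyperparameters are illustrative:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_one_vs_all(X, y, n_classes, alpha=0.1, iters=1000):
    """Fit one binary logistic regression per class (labels 0..n_classes-1)."""
    m, n = X.shape
    Theta = np.zeros((n_classes, n))
    for c in range(n_classes):
        y_c = (y == c).astype(float)  # 1 for class c, 0 for all other classes
        for _ in range(iters):
            h = sigmoid(X @ Theta[c])
            Theta[c] -= alpha * (X.T @ (h - y_c)) / m
    return Theta

def predict_one_vs_all(Theta, x):
    """Pick the class whose classifier outputs the highest probability."""
    return int(np.argmax(sigmoid(Theta @ x)))

X = np.array([[1.0, 0.0], [1.0, 2.0], [1.0, 4.0]])  # leading column is x_0 = 1
y = np.array([0, 1, 2])
Theta = train_one_vs_all(X, y, n_classes=3)
print(predict_one_vs_all(Theta, np.array([1.0, 3.9])))  # likely class 2, near x1 = 4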
Advanced Optimization algorithms (not part of this course)
Optimization algorithms:
- Gradient descent
- Conjugate gradient
- BFGS
- L-BFGS
Advantages of the last three algorithms:
- No need to manually pick the learning rate $\alpha$
- Often converge faster than gradient descent
Disadvantages:
- More complex
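As an illustration (not part of this course), an off-the-shelf optimizer such as SciPy's L-BFGS can replace hand-rolled gradient descent; the cost and gradient are as derived earlier, the data is a toy example, and SciPy's availability is assumed:

import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grad(theta, X, y):
    """Return J(theta) and its gradient together, as minimize expects with jac=True."""
    m = len(y)
    h = sigmoid(X @ theta)
    eps = 1e-12  # guards against log(0)
    J = -(y @ np.log(h + eps) + (1 - y) @ np.log(1 - h + eps)) / m
    grad = X.T @ (h - y) / m
    return J, grad

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
# L-BFGS chooses its own step sizes, so no learning rate needs to be picked
res = minimize(cost_and_grad, x0=np.zeros(X.shape[1]), args=(X, y),
               jac=True, method="L-BFGS-B")
print(res.x)  # fitted parameters theta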