Chapter 4 - Overfitting


Chapter 4

Overfitting
MSc. Nguyen Khanh Loi
[email protected]
8/2023
Content

- The problem of overfitting
- Addressing overfitting
- Cost function with Regularization
- Regularized linear regression
- Regularized logistic regression

The problem of overfitting


Example: Linear regression (housing prices)

[Figure: three plots of Price vs. Size, one per case below]

- Underfit / high bias: does not fit the training set well
- Just right / generalizes: fits the training set pretty well
- Overfit / high variance: fits the training set extremely well

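Overfitting: If we have too many features, the learned hypothesis may fit the training set very well (the training cost $\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$ is close to 0), but fail to generalize to new examples (for instance, predicting prices of houses it has not seen).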
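
The three cases can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: the five (size, price) pairs are made-up data with sizes rescaled to [-1, 1], and `solve`, `fit_poly`, and `training_cost` are hypothetical helper names, not part of the lecture. The sketch fits polynomials of degree 1, 2, and 4 by least squares and compares their training cost.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (V^T V) theta = V^T y."""
    V = [[x ** p for p in range(degree + 1)] for x in xs]
    size = degree + 1
    A = [[sum(row[a] * row[c] for row in V) for c in range(size)] for a in range(size)]
    rhs = [sum(row[a] * y for row, y in zip(V, ys)) for a in range(size)]
    return solve(A, rhs)


def training_cost(theta, xs, ys):
    """Cost (1 / 2m) * sum of squared errors on the training set."""
    m = len(xs)
    predictions = [sum(t * x ** p for p, t in enumerate(theta)) for x in xs]
    return sum((h - y) ** 2 for h, y in zip(predictions, ys)) / (2 * m)


# Made-up illustration data: rescaled house sizes and prices.
sizes = [-1.0, -0.5, 0.0, 0.5, 1.0]
prices = [150.0, 260.0, 330.0, 360.0, 380.0]

costs = {d: training_cost(fit_poly(sizes, prices, d), sizes, prices) for d in (1, 2, 4)}
for degree, cost in costs.items():
    print(f"degree {degree}: training cost = {cost:.4f}")
```

With five points, the degree-4 polynomial passes through every training example, so its training cost is essentially zero. That says nothing about how well it predicts new houses, which is the point of the slide.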

Example: Logistic regression

[Figure: three plots of decision boundaries in the $(x_1, x_2)$ plane: Underfit, Good, Overfit]

($g$ = sigmoid function)

Addressing overfitting


Collect more training examples

[Figure: two plots of Price vs. Size with training examples marked x]


Select features to include/exclude

Available features: size of house, no. of bedrooms, no. of floors, age of house, average income in neighborhood, kitchen size

[Figure: Price vs. Size]

- Many features + insufficient data → overfit
- Selected features (size, bedrooms, age) → just right

This approach is called feature selection.

Regularization

Overfit (large values for $\theta_j$):

$$h_\theta(x) = 20x - 302x^2 + 22x^3 - 111x^4 + 95$$

After regularization (small values for $\theta_j$):

$$h_\theta(x) = 4.5x - 3.2x^2 + 0.0013x^3 - 0.0011x^4 + 8.3$$

[Figure: Price vs. features for each hypothesis]
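
As a rough numerical check of why small parameters give a smoother curve, the sketch below evaluates both hypotheses from this slide at two inputs and compares how far the output moves. The inputs `1.0` and `2.0` and the helper name `h` are arbitrary assumptions made for illustration.

```python
def h(coeffs, x):
    """Evaluate a polynomial hypothesis; coeffs[p] multiplies x**p."""
    return sum(c * x ** p for p, c in enumerate(coeffs))


# Coefficients ordered from x^0 to x^4, copied from the two hypotheses on the slide.
overfit = [95, 20, -302, 22, -111]
regularized = [8.3, 4.5, -3.2, 0.0013, -0.0011]

# How much does each prediction change when x moves from 1.0 to 2.0?
swing_overfit = abs(h(overfit, 2.0) - h(overfit, 1.0))
swing_regularized = abs(h(regularized, 2.0) - h(regularized, 1.0))

print(f"overfit hypothesis swing:     {swing_overfit:.4f}")
print(f"regularized hypothesis swing: {swing_regularized:.4f}")
```

The hypothesis with large coefficients changes by thousands over this interval, while the regularized one changes by only a few units.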
Cost function with Regularization

Cost function
Intuition

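[Figure: two plots of Price vs. Size of house]

Suppose we penalize $\theta_3$ and $\theta_4$ and make them really small:

$$\min_\theta \; \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + 1000\,\theta_3^2 + 1000\,\theta_4^2$$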

Regularization

Small values for parameters:
- "Simpler" hypothesis
- Less prone to overfitting

Housing example:
- Features (size of house, no. of bedrooms, no. of floors, age of house, kitchen size, ...): $x_1, x_2, \dots, x_n$
- Parameters: $\theta_0, \theta_1, \dots, \theta_n$


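$$J(\theta) = \underbrace{\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2}_{\text{mean squared error}} + \underbrace{\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2}_{\text{regularization term}}$$

- Mean squared error: fit the data
- Regularization term: keep $\theta_j$ small
- $\lambda$ balances both goals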
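
A minimal Python sketch of such a cost function follows. It assumes the common convention $J(\theta) = \frac{1}{2m}\left[\sum_i (h_\theta(x^{(i)}) - y^{(i)})^2 + \lambda\sum_{j \ge 1}\theta_j^2\right]$, with each row of `X` starting with $x_0 = 1$ and $\theta_0$ left out of the penalty. The function name `regularized_cost` and the tiny data set are hypothetical.

```python
def regularized_cost(theta, X, y, lam):
    """Squared-error cost plus an L2 penalty on theta[1:] (theta[0] is not penalized)."""
    m = len(y)
    squared_error = 0.0
    for row, target in zip(X, y):
        prediction = sum(t * x_j for t, x_j in zip(theta, row))
        squared_error += (prediction - target) ** 2
    penalty = lam * sum(t ** 2 for t in theta[1:])
    return (squared_error + penalty) / (2 * m)


# Made-up data lying exactly on y = 2x; the first column is x_0 = 1.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [2.0, 4.0, 6.0]
theta = [0.0, 2.0]

print(regularized_cost(theta, X, y, lam=0.0))  # data term only
print(regularized_cost(theta, X, y, lam=3.0))  # data term plus penalty
```

Here the data term is zero because `theta` fits the points exactly, so any cost reported for `lam=3.0` comes entirely from the penalty on $\theta_1$.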


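If $\lambda \approx 0$, the regularization term has almost no effect, so the hypothesis can still overfit.

[Figure: Price vs. features]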


If $\lambda$ is very large ⟹ $\theta_j \approx 0$, so the hypothesis becomes nearly flat and underfits.

[Figure: Price vs. features]
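
The effect of $\lambda$ can be checked on a one-feature example. The sketch below assumes the cost $\frac{1}{2m}\sum_i(\theta_0 + \theta_1 x^{(i)} - y^{(i)})^2 + \frac{\lambda}{2m}\theta_1^2$ with $\theta_0$ unpenalized; setting its derivatives to zero gives $\theta_1 = S_{xy} / (S_{xx} + \lambda)$ and $\theta_0 = \bar{y} - \theta_1\bar{x}$. The function name and the data are made up for illustration.

```python
def fit_line_regularized(xs, ys, lam):
    """Closed-form minimizer for one feature with an L2 penalty on the slope only."""
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    theta1 = s_xy / (s_xx + lam)
    theta0 = y_mean - theta1 * x_mean
    return theta0, theta1


# Made-up data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

no_reg = fit_line_regularized(xs, ys, lam=0.0)
huge_reg = fit_line_regularized(xs, ys, lam=1e6)
print("lambda = 0:   ", no_reg)
print("lambda = 1e6: ", huge_reg)
```

With no penalty the fit recovers the slope of the data. With a very large penalty the slope is driven to almost zero and the prediction collapses to roughly the mean of `ys`, a flat line.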

Regularized linear regression


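Gradient descent:

Repeat {

$$\theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad (j = 1, \dots, n)$$

}

(simultaneously update $\theta_j$ for every $j = 0, 1, \dots, n$)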
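
A minimal sketch of this loop in plain Python, assuming the usual regularized update $\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_i (h_\theta(x^{(i)}) - y^{(i)})x_j^{(i)} + \frac{\lambda}{m}\theta_j\right]$ for $j \ge 1$ and no penalty on $\theta_0$. The function name, learning rate, iteration count, and data are illustrative assumptions.

```python
def gradient_descent(X, y, alpha, lam, iterations):
    """Batch gradient descent for linear regression with an L2 penalty on theta[1:]."""
    m, n = len(y), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        # Errors use the current theta for every j, which gives a simultaneous update.
        errors = [sum(t * x_j for t, x_j in zip(theta, row)) - target
                  for row, target in zip(X, y)]
        new_theta = []
        for j in range(n):
            grad = sum(e * row[j] for e, row in zip(errors, X)) / m
            if j > 0:  # theta_0 is not regularized
                grad += (lam / m) * theta[j]
            new_theta.append(theta[j] - alpha * grad)
        theta = new_theta
    return theta


# Made-up data on y = 6 + 2x with a centered feature; the first column is x_0 = 1.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

theta = gradient_descent(X, y, alpha=0.1, lam=10.0, iterations=500)
print(theta)
```

For this data the penalty with `lam=10.0` halves the slope relative to the unregularized fit while leaving the intercept at the mean of `y`.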


Gradient descent

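Repeat {

$$\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \underbrace{\alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}}_{\text{usual update}} \qquad (j = 1, \dots, n)$$

}

The factor $1 - \alpha\frac{\lambda}{m}$ is slightly less than 1, so each step shrinks $\theta_j$ a little before applying the usual update.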
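
The regrouped form is the same update written differently: $\theta_j - \alpha\left[g_j + \frac{\lambda}{m}\theta_j\right] = \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha g_j$, where $g_j$ stands for the unregularized gradient term. A quick numerical check, with arbitrary assumed values for $\theta_j$, $g_j$, $\alpha$, $\lambda$, and $m$:

```python
def update_standard(theta_j, grad_j, alpha, lam, m):
    """theta_j - alpha * (unregularized gradient + (lambda / m) * theta_j)."""
    return theta_j - alpha * (grad_j + (lam / m) * theta_j)


def update_shrink(theta_j, grad_j, alpha, lam, m):
    """Shrink theta_j by (1 - alpha * lambda / m), then apply the usual update."""
    return theta_j * (1 - alpha * lam / m) - alpha * grad_j


# Arbitrary illustrative values.
theta_j, grad_j, alpha, lam, m = 2.0, 0.5, 0.01, 1.0, 50

shrink_factor = 1 - alpha * lam / m
print("shrink factor:", shrink_factor)
print("standard form:", update_standard(theta_j, grad_j, alpha, lam, m))
print("shrink form:  ", update_shrink(theta_j, grad_j, alpha, lam, m))
```

With these values the shrink factor is very close to 1, which is typical when $\alpha$ is small and $m$ is large.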

Regularized logistic regression


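[Figure: decision boundary in the $(x_1, x_2)$ plane]

Cost function:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$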
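
A minimal Python sketch of a regularized logistic cost, assuming the standard cross-entropy loss plus the penalty $\frac{\lambda}{2m}\sum_{j \ge 1}\theta_j^2$ with $\theta_0$ unpenalized. The names `sigmoid` and `logistic_cost` and the four data points are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_cost(theta, X, y, lam):
    """Average cross-entropy loss plus an L2 penalty on theta[1:]."""
    m = len(y)
    loss = 0.0
    for row, label in zip(X, y):
        h = sigmoid(sum(t * x_j for t, x_j in zip(theta, row)))
        loss += -label * math.log(h) - (1 - label) * math.log(1 - h)
    penalty = lam / (2 * m) * sum(t ** 2 for t in theta[1:])
    return loss / m + penalty


# Made-up one-feature data; the first column is x_0 = 1.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 1]

print(logistic_cost([0.0, 0.0], X, y, lam=1.0))  # every prediction is 0.5
print(logistic_cost([0.0, 1.0], X, y, lam=0.0))
print(logistic_cost([0.0, 1.0], X, y, lam=2.0))
```

With all parameters at zero every prediction is 0.5, so the cost equals $\log 2$ regardless of $\lambda$. For a fixed nonzero $\theta$, raising $\lambda$ adds only the penalty term.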

Gradient descent

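Repeat {

$$\theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad (j = 1, \dots, n)$$

}

This looks the same as the update for linear regression, but here $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$.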
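
The loop can be sketched in plain Python as below. It assumes the same update shape as regularized linear regression, with the sigmoid hypothesis $h_\theta(x) = 1 / (1 + e^{-\theta^T x})$ and no penalty on $\theta_0$. The function names, hyperparameters, and data are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_gradient_descent(X, y, alpha, lam, iterations):
    """Batch gradient descent for logistic regression with an L2 penalty on theta[1:]."""
    m, n = len(y), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        # The only change from linear regression: predictions go through the sigmoid.
        errors = [sigmoid(sum(t * x_j for t, x_j in zip(theta, row))) - label
                  for row, label in zip(X, y)]
        new_theta = []
        for j in range(n):
            grad = sum(e * row[j] for e, row in zip(errors, X)) / m
            if j > 0:  # theta_0 is not regularized
                grad += (lam / m) * theta[j]
            new_theta.append(theta[j] - alpha * grad)
        theta = new_theta
    return theta


# Made-up one-feature data; the first column is x_0 = 1.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 1]

weak = logistic_gradient_descent(X, y, alpha=0.1, lam=0.1, iterations=2000)
strong = logistic_gradient_descent(X, y, alpha=0.1, lam=10.0, iterations=2000)
print("lambda = 0.1: ", weak)
print("lambda = 10.0:", strong)
```

Both runs classify the four points correctly, but the stronger penalty keeps $\theta_1$ much smaller, which corresponds to a less sharply tuned decision function.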
