Subtitle

The document discusses the concepts of overfitting and underfitting in machine learning, using linear regression and logistic regression as examples. Overfitting occurs when a model fits the training data too closely, resulting in poor generalization to new data, while underfitting happens when a model fails to capture the underlying patterns in the data. The document emphasizes the importance of finding a balance between bias and variance to achieve a model that generalizes well, and introduces regularization as a technique to mitigate overfitting.

Uploaded by

ArMix

Now you've seen a couple of different learning algorithms, linear regression and logistic regression. They work well for many tasks. But sometimes in an application, the algorithm can run into a problem called overfitting, which can cause it to perform poorly. What I'd like to do in this video is show you what overfitting is, as well as a closely related, almost opposite problem called underfitting. In the next videos after this, I'll share with you some techniques for addressing overfitting. In particular, there's a method called regularization. Very useful technique. I use it all the time. Regularization will help you minimize this overfitting problem and get your learning algorithms to work much better.

To help us understand what overfitting is, let's take a look at a few examples. Let's go back to our original running example of predicting housing prices with linear regression, where you want to predict the price as a function of the size of a house. Suppose your dataset looks like this, with the input feature x being the size of the house and the value y that you're trying to predict being the price of the house. One thing you could do is fit a linear function to this data.
If you do that, you get a straight-line fit to the data that maybe looks like this. But this isn't a very good model. Looking at the data, it seems pretty clear that as the size of the house increases, the housing prices flatten out. This algorithm does not fit the training data very well. The technical term for this is that the model is underfitting the training data. Another term is that the algorithm has high bias.

You may have read in the news about some learning algorithms really, unfortunately, demonstrating bias against certain ethnicities or certain genders. In machine learning, the term bias has multiple meanings. Checking learning algorithms for bias based on characteristics such as gender or ethnicity is absolutely critical. But the term bias has a second, technical meaning as well, which is the one I'm using here: the algorithm has underfit the data, meaning that it's just not even able to fit the training set that well. There's a clear pattern in the training data that the algorithm is just unable to capture. Another way to think of this form of bias is as if the learning algorithm has a very strong preconception, or we say a very strong bias, that the housing prices are going to be a completely linear function of the size, despite data to the contrary. This preconception that the data is linear causes it to fit a straight line that fits the data poorly, leading it to underfit the data.

Now, let's look at a second variation of a model, which is if you instead fit a quadratic function to the data with two features, x and x^2. Then when you fit the parameters W1 and W2, you can get a curve that fits the data somewhat better. Maybe it looks like this. Also, if you were to get a new house that's not in this set of five training examples, this model would probably do quite well on that new house.
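
As a rough illustration of the straight-line and quadratic fits described above, the following Python sketch fits both to five made-up (size, price) points. The data values, the use of NumPy's polyfit, and the squared-error cost with a 1/(2m) factor are all assumptions for illustration; the video shows these fits only on a slide.

```python
import numpy as np

# Hypothetical training set (not from the video): house size in 1000 sq ft
# and price in $1000s, chosen so that prices flatten out as size grows.
sizes = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
prices = np.array([200.0, 300.0, 370.0, 410.0, 430.0])


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns coefficients, highest power first."""
    return np.polyfit(x, y, degree)


def training_cost(coeffs, x, y):
    """Squared-error cost on the training set, assuming a 1/(2m) convention."""
    errors = np.polyval(coeffs, x) - y
    return float(np.sum(errors ** 2) / (2 * len(x)))


linear = fit_polynomial(sizes, prices, 1)     # feature: x
quadratic = fit_polynomial(sizes, prices, 2)  # features: x and x^2

linear_cost = training_cost(linear, sizes, prices)
quadratic_cost = training_cost(quadratic, sizes, prices)

# A house size that is not among the five training examples.
new_size = 2.25
linear_prediction = float(np.polyval(linear, new_size))
quadratic_prediction = float(np.polyval(quadratic, new_size))

print(f"linear    cost={linear_cost:.2f}  prediction={linear_prediction:.1f}")
print(f"quadratic cost={quadratic_cost:.2f}  prediction={quadratic_prediction:.1f}")
```

On data shaped like this, the quadratic fit should report a lower training cost than the straight line, which is the numerical counterpart of a curve that "fits the data somewhat better".
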
If you're a real estate agent, the idea that you want your learning algorithm to do well even on examples that are not in the training set is called generalization. Technically, we say that you want your learning algorithm to generalize well, which means to make good predictions even on brand-new examples that it has never seen before. This quadratic model seems to fit the training set not perfectly, but pretty well. I think it would generalize well to new examples.

Now let's look at the other extreme. What if you were to fit a fourth-order polynomial to the data? You have x, x^2, x^3, and x^4 all as features. With this fourth-order polynomial, you can actually fit a curve that passes through all five of the training examples exactly. You might get a curve that looks like this. This, on one hand, seems to do an extremely good job fitting the training data because it passes through all of the training data perfectly. In fact, you'd be able to choose parameters that will result in the cost function being exactly equal to zero because the errors are zero on all five training examples. But this is a very wiggly curve; it's going up and down all over the place. If you have this house size right here, the model would predict that this house is cheaper than houses that are smaller than it. We don't think that this is a particularly good model for predicting housing prices. The technical term is that we'll say this model has overfit the data, or this model has an overfitting problem. Because even though it fits the training set very well, it has fit the data almost too well, hence it is overfit. It does not look like this model will generalize to new examples that it has never seen before.

Another term for this is that the algorithm has high variance. In machine learning, many people will use the terms overfit and high variance almost interchangeably, and we'll use the terms underfit and high bias almost interchangeably. The intuition behind overfitting or high variance is that the algorithm is trying very hard to fit every single training example. It turns out that if your training set were just even a little bit different, say one house was priced just a little bit more or a little bit less, then the function that the algorithm fits could end up being totally different. If two different machine learning engineers were to fit this fourth-order polynomial model to just slightly different datasets, they could end up with totally different predictions or highly variable predictions. That's why we say the algorithm has high variance.

Contrasting this rightmost model with the one in the middle for the same house, it seems the middle model gives a much more reasonable prediction for price. There isn't really a name for this case in the middle, but I'm just going to call this just right, because it is neither underfit nor overfit. You can say that the goal of machine learning is to find a model that hopefully is neither underfitting nor overfitting. In other words, hopefully, a model that has neither high bias nor high variance.

When I think about underfitting and overfitting, high bias and high variance, I'm sometimes reminded of the children's story of Goldilocks and the Three Bears. In this children's tale, a girl called Goldilocks visits the home of a bear family. There's a bowl of porridge that's too cold to taste, and so that's no good. There's also a bowl of porridge that's too hot to eat. That's no good either. But there's a bowl of porridge that is neither too cold nor too hot. The temperature is in the middle, which is just right to eat.

To recap, if you have too many features, like the fourth-order polynomial on the right, then the model may fit the training set well, but almost too well, or overfit and have high variance. On the flip side, if you have too few features, then in this example, like the one on the left, it underfits and has high bias. In this example, using quadratic features x and x squared seems to be just right.

So far we've looked at underfitting and overfitting for a linear regression model. Similarly, overfitting applies to classification as well. Here's a classification example with two features, x_1 and x_2, where x_1 is maybe the tumor size and x_2 is the age of the patient. We're trying to classify if a tumor is malignant or benign, as denoted by these crosses and circles. One thing you could do is fit a logistic regression model.
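
The model formula is shown on a slide and is not part of this transcript. Assuming parameters named w_1, w_2 and b (the symbol names are an assumption here), a logistic regression model on these two features would presumably be written as:

```latex
f(x) = g(z) = \frac{1}{1 + e^{-z}}, \qquad z = w_1 x_1 + w_2 x_2 + b
```

Since g(z) >= 0.5 exactly when z >= 0, predicting the positive class whenever f(x) >= 0.5 (a threshold assumed here) amounts to checking the sign of z.
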
Just a simple model like this, where, as usual, g is the sigmoid function and this term here inside is z. If you do that, you end up with a straight line as the decision boundary. This is the line where z is equal to zero that separates the positive and negative examples. This straight line doesn't look terrible. It looks okay, but it doesn't look like a very good fit to the data either. This is an example of underfitting or of high bias.

Let's look at another example. If you were to add to your features these quadratic terms, then z becomes this new term in the middle, and the decision boundary, that is, where z equals zero, can look more like this, more like an ellipse or part of an ellipse. This is a pretty good fit to the data, even though it does not perfectly classify every single training example in the training set.
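
To make the comparison between a straight-line and a quadratic decision boundary concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the data is synthetic (two features scaled to [-1, 1], with positives inside a circle) rather than tumor measurements, the feature set x_1, x_2, x_1^2, x_1*x_2, x_2^2 is one plausible reading of "these quadratic terms", and plain gradient descent stands in for whatever optimizer the course uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two features (not real patient data): both are
# scaled to [-1, 1], and the positive class sits inside a circle.
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 0.5).astype(float)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def add_quadratic_terms(X):
    """Map (x1, x2) to (x1, x2, x1^2, x1*x2, x2^2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1 ** 2, x1 * x2, x2 ** 2])


def train_logistic(features, labels, learning_rate=1.0, steps=5000):
    """Plain batch gradient descent on the logistic (cross-entropy) loss."""
    m, n = features.shape
    w, b = np.zeros(n), 0.0
    for _ in range(steps):
        error = sigmoid(features @ w + b) - labels
        w -= learning_rate * (features.T @ error) / m
        b -= learning_rate * float(np.mean(error))
    return w, b


def accuracy(features, labels, w, b):
    """Predict the positive class where z >= 0, i.e. where g(z) >= 0.5."""
    predictions = (features @ w + b >= 0).astype(float)
    return float(np.mean(predictions == labels))


# Straight-line boundary: z uses x1 and x2 only.
w_lin, b_lin = train_logistic(X, y)

# Curved boundary: z also uses the quadratic terms.
X_quad = add_quadratic_terms(X)
w_quad, b_quad = train_logistic(X_quad, y)

linear_accuracy = accuracy(X, y, w_lin, b_lin)
quadratic_accuracy = accuracy(X_quad, y, w_quad, b_quad)
print(f"straight-line boundary accuracy: {linear_accuracy:.2f}")
print(f"quadratic boundary accuracy:     {quadratic_accuracy:.2f}")
```

Because the synthetic positives lie inside a circle, no straight line separates them well, while the quadratic features let the boundary z = 0 curve around them. The synthetic labels here are noise-free, so unlike the slide's example the quadratic model may classify nearly every training point correctly.
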
Notice how some of these crosses get classified among the circles. But this model looks pretty good. I'm going to call it just right. It looks like this would generalize pretty well to new patients.

Finally, at the other extreme, if you were to fit a very high-order polynomial with many features like these, then the model may try really hard and contort or twist itself to find a decision boundary that fits your training data perfectly. Having all these higher-order polynomial features allows the algorithm to choose this really overly complex decision boundary. If the features are tumor size and age, and you're trying to classify tumors as malignant or benign, then this doesn't really look like a very good model for making predictions. Once again, this is an instance of overfitting and high variance because this model, despite doing very well on the training set, doesn't look like it'll generalize well to new examples.

Now you've seen how an algorithm can underfit or have high bias, or overfit and have high variance. You may want to know how you can get a model that is just right. In the next video, we'll look at some ways you can address the issue of overfitting. We'll also touch on some ideas relevant for addressing underfitting. Let's go on to the next video.
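
The fourth-order polynomial and high-variance discussion from the regression part of this video can be tried out numerically. This is a minimal sketch under assumptions: the five (size, price) points are made up, NumPy's polyfit stands in for whatever fitting procedure the course uses, and "a slightly different dataset" is modeled by raising a single price by 20.

```python
import numpy as np

# Hypothetical data (not from the video): size in 1000 sq ft, price in $1000s.
sizes = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
prices = np.array([200.0, 300.0, 370.0, 410.0, 430.0])

# A slightly different training set: the middle house is priced a bit higher.
prices_shifted = prices.copy()
prices_shifted[2] += 20.0

# Sizes at which to compare predictions, spanning the training range.
grid = np.linspace(1.0, 3.0, 41)


def training_cost(coeffs, x, y):
    """Squared-error cost on the training set, assuming a 1/(2m) convention."""
    errors = np.polyval(coeffs, x) - y
    return float(np.sum(errors ** 2) / (2 * len(x)))


def max_prediction_shift(degree):
    """Largest change in prediction on the grid when one price changes."""
    original_fit = np.polyfit(sizes, prices, degree)
    shifted_fit = np.polyfit(sizes, prices_shifted, degree)
    difference = np.polyval(shifted_fit, grid) - np.polyval(original_fit, grid)
    return float(np.max(np.abs(difference)))


# Five coefficients for five points: the curve can pass through every point.
quartic_cost = training_cost(np.polyfit(sizes, prices, 4), sizes, prices)

quadratic_shift = max_prediction_shift(2)
quartic_shift = max_prediction_shift(4)

print(f"fourth-order training cost: {quartic_cost:.2e}")
print(f"max prediction shift, quadratic:    {quadratic_shift:.1f}")
print(f"max prediction shift, fourth-order: {quartic_shift:.1f}")
```

The fourth-order fit drives the training cost to essentially zero (up to floating-point error), yet its predictions should move more than the quadratic's when a single price changes. With only five evenly spaced points the effect is modest; the sketch is meant to illustrate the intuition, not to measure variance rigorously.
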
