Deep Learning 2

The document discusses the use of logistic regression and multilayer perceptrons in machine learning, highlighting the importance of deep learning for capturing complex relationships in data. It emphasizes the issue of overfitting, where a model may perform well on training data but poorly on real-world data, and outlines standard validation strategies to assess model performance. The document also explains the significance of training, validation, and test sets in evaluating machine learning models.

Uploaded by adel.surface
Copyright: © All Rights Reserved

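Logistic Regression and Multilayer Perceptron

[Figure: two network diagrams. Logistic regression: the features of data x_i1, x_i2, …, x_iM are weighted by b_1, …, b_M and combined into z_i, giving the output σ(z_i). Multilayer perceptron: the same features x_i1, x_i2, …, x_iM feed a first hidden layer z_i1, …, z_iK with activations σ(z_i1), …, σ(z_iK), then a second hidden layer ζ_i1, …, ζ_iJ with activations σ(ζ_i1), …, σ(ζ_iJ), then the output y_i with σ(y_i).]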
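
The two diagrams can be read as forward passes. The sketch below is one possible NumPy rendering, assuming a sigmoid for σ and fully connected layers; the weight names (W1, c1, W2, c2, w3, c3), the layer sizes, and the random values are hypothetical and do not come from the slides.

```python
import numpy as np

def sigmoid(a):
    """Logistic function, assumed here to be the sigma in the diagrams."""
    return 1.0 / (1.0 + np.exp(-a))

def logistic_regression(x, b0, b):
    """z_i = b0 + sum_m b_m * x_im, output sigma(z_i)."""
    z = b0 + x @ b
    return sigmoid(z)

def multilayer_perceptron(x, W1, c1, W2, c2, w3, c3):
    """Two hidden layers (sizes K and J) followed by a single output."""
    h1 = sigmoid(x @ W1 + c1)    # sigma(z_i1), ..., sigma(z_iK)
    h2 = sigmoid(h1 @ W2 + c2)   # sigma(zeta_i1), ..., sigma(zeta_iJ)
    y = h2 @ w3 + c3             # y_i
    return sigmoid(y)            # sigma(y_i)

# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
M, K, J = 4, 5, 3
x = rng.normal(size=(2, M))      # two data points with M features each

p_lr = logistic_regression(x, 0.1, rng.normal(size=M))
p_mlp = multilayer_perceptron(
    x,
    rng.normal(size=(M, K)), np.zeros(K),
    rng.normal(size=(K, J)), np.zeros(J),
    rng.normal(size=J), 0.0,
)
print(p_lr, p_mlp)
```

Both functions return one value in (0, 1) per data point. The multilayer perceptron differs only in passing the features through intermediate layers first.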
Complex Relationships Using Deep Learning

Can be captured by using deep neural networks
Can be represented accurately and predicted well
Can give perfect performance in the training set
Can perform poorly in the real world
Needs to be validated

[Figure: data plotted over the two features x1 and x2]
Overfitting is when the learned model increases in complexity to fit the observed training data too well
An overfit model will not work well on future data in the real world
Goal: come up with a function f(x) that predicts the observation given x

[Figure: observations plotted as f(x) against x, both axes running from 0 to 4]
Increasing Polynomial Order

[Figure: three panels showing the same observations, f(x) against x with both axes from 0 to 4, with a 1-order fit, a 3-order fit, and an 8-order fit]
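
The effect of increasing polynomial order can be reproduced with a short experiment. The sketch below uses synthetic data; the underlying linear function, the noise level, and the sample sizes are assumptions chosen for illustration, not values taken from the slides.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def true_f(x):
    """Hypothetical underlying relationship (an assumption, not from the slides)."""
    return 1.0 + 0.5 * x

# Ten noisy training observations on [0, 4], plus new data from the same process.
x_train = np.linspace(0.0, 4.0, 10)
y_train = true_f(x_train) + rng.normal(scale=0.3, size=x_train.size)
x_new = rng.uniform(0.0, 4.0, size=200)
y_new = true_f(x_new) + rng.normal(scale=0.3, size=x_new.size)

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

train_mse, new_mse = {}, {}
for order in (1, 3, 8):
    model = Polynomial.fit(x_train, y_train, order)  # least-squares fit
    train_mse[order] = mse(model, x_train, y_train)
    new_mse[order] = mse(model, x_new, y_new)
    print(f"order {order}: train MSE {train_mse[order]:.4f}, "
          f"new-data MSE {new_mse[order]:.4f}")
```

Because each lower-order polynomial is a special case of the higher-order one, the least-squares training error cannot go up as the order increases. The error on new data carries no such guarantee, which is the gap that overfitting describes.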
Problems with Overfitting

Increasing the number of parameters can increase the error rate on new data
The learned relationship may be too complex for reality

[Figure: multilayer perceptron with features of data x_i1, x_i2, …, x_iM, hidden layers z_i1, …, z_iK and ζ_i1, …, ζ_iJ, and output σ(y_i)]
Standard Validation Strategy

Problems with Overfitting
Increasing the number of parameters can increase the error rate on new data
The learned relationship may be too complex for reality
Models and analysis do not generalize

[Figure: training set (x1, y1), (x2, y2), …, (xN, yN) used to fit the parameters (b0, b1, …, bM)]


How well will this work in the real world?
Standard Validation Strategy

[Figure: the training set (x1, y1), …, (xN, yN) is used to fit the parameters (b0, b1, …, bM); new real-world data (x1, y1), …, (xN, yN) is then used to estimate real-world performance]
Standard Validation Strategy

Collecting new real-world data is costly. Can we use existing data to estimate performance?
Split Data in Separate Groups

[Figure: all available data (x1, y1), …, (xN, yN) is divided by random assignment into three groups: training, validation, and testing]
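
Random assignment into the three groups can be sketched as a shuffle of the row indices. The function name split_indices and the 60/20/20 proportions below are assumptions; the slides do not state how large each group should be.

```python
import numpy as np

def split_indices(n, frac_train=0.6, frac_val=0.2, seed=0):
    """Randomly assign n data points to training, validation, and testing.

    The default fractions are illustrative assumptions; whatever is left
    after training and validation goes to testing.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)               # random order of 0, ..., n-1
    n_train = int(round(frac_train * n))
    n_val = int(round(frac_val * n))
    train = perm[:n_train]
    val = perm[n_train:n_train + n_val]
    test = perm[n_train + n_val:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

Each data point lands in exactly one group, so nothing used for fitting can leak into the validation or testing groups.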
Split Data in Separate Groups

[Figure: the points of all available data (x1, y1), …, (xN, yN) moved into separate training, validation, and testing groups]
Test Set

Standard practice in machine learning
Created prior to any analysis
Will never be used to learn or fit any parameters
Can evaluate performance of network on test set
Analogous to running a new experiment

[Figure: test set taken from all available data (x1, y1), …, (xN, yN)]
Test Set

Should ideally only be used once
Reusing a test set will lead to bias
Biased results will lead to optimistic performance estimates

[Figure: test set taken from all available data (x1, y1), …, (xN, yN)]
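
The bias from reusing a test set can be illustrated with a small simulation. In the sketch below, everything is a hypothetical setup: the labels are coin flips unrelated to the features, so no model can truly beat 50% accuracy, and the "models" are random linear classifiers chosen by their score on the same test set.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_test, n_fresh, n_models = 5, 100, 100_000, 200

# Labels are independent of the features, so true accuracy is 50% for any model.
X_test = rng.normal(size=(n_test, n_features))
y_test = rng.integers(0, 2, size=n_test)
X_fresh = rng.normal(size=(n_fresh, n_features))
y_fresh = rng.integers(0, 2, size=n_fresh)

def accuracy(w, X, y):
    """Accuracy of the linear classifier that predicts 1 when X @ w > 0."""
    pred = (X @ w > 0).astype(int)
    return float(np.mean(pred == y))

# Evaluate many candidate models on the SAME test set and keep the best one.
candidates = rng.normal(size=(n_models, n_features))
test_scores = [accuracy(w, X_test, y_test) for w in candidates]
best = int(np.argmax(test_scores))

reused_test_acc = test_scores[best]                       # optimistic estimate
fresh_acc = accuracy(candidates[best], X_fresh, y_fresh)  # unbiased estimate
print(f"accuracy on reused test set: {reused_test_acc:.3f}")
print(f"accuracy on fresh data:      {fresh_acc:.3f}")
```

The selected model looks better than chance on the reused test set only because it was picked for that score. On fresh data it falls back to roughly 50%.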


Validation Set

Can be used to compare which approach is best
Not used to learn parameters
Used repeatedly to estimate the performance of a model
Can be used to pick out the best-performing model

[Figure: validation set taken from all available data (x1, y1), …, (xN, yN)]


[Figure: workflow across the three groups. Training set (x1, y1), …, (xN, yN): fit the parameters (b0, b1, …, bM). Validation set: estimate performance on the validation set, then refine the model and train again. Testing set: final performance evaluation.]
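
The workflow in the figure can be sketched end to end. The example below is a minimal illustration under stated assumptions: the data are synthetic, the candidate models are polynomials of order 1, 3, and 8, and the group sizes (50/20/20) are arbitrary.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Synthetic data (an assumption): a noisy linear relationship on [0, 4].
n = 90
x = rng.uniform(0.0, 4.0, size=n)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=n)

# Random assignment into training, validation, and testing groups.
perm = rng.permutation(n)
train, val, test = perm[:50], perm[50:70], perm[70:]

def mse(model, idx):
    """Mean squared error of a fitted model on the rows selected by idx."""
    return float(np.mean((model(x[idx]) - y[idx]) ** 2))

# Training: fit the parameters of each candidate model on the training set only.
models = {order: Polynomial.fit(x[train], y[train], order) for order in (1, 3, 8)}

# Validation: used repeatedly to compare candidates and pick the best one.
val_mse = {order: mse(model, val) for order, model in models.items()}
best_order = min(val_mse, key=val_mse.get)

# Testing: used once, for the final performance evaluation of the chosen model.
test_mse = mse(models[best_order], test)
print(f"chosen order: {best_order}, test MSE: {test_mse:.4f}")
```

The test set is touched only on the last line of the computation, after the model has been chosen, so the reported test error is not influenced by the selection step.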
