ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN
VNU-University of Engineering and Technology
INT3405 - Machine Learning
Lecture 7: Model Optimization
Hanoi, 10/2024
Outline
● True Error versus Empirical Error
● Overfitting, Underfitting
● Bias-Variance Tradeoff
● Model Optimization
○ Feature Selection
○ Regularization
○ Model Ensemble
Recap: Support Vector Machines (SVM)
[Figure: linear SVM maximum-margin separator with the support vectors highlighted]
Recap: Soft Margin SVM
● Start from the standard linear SVM:
○ Introduce slack variables
○ Relax the constraints
○ Penalize the relaxation
Primal Problem:
C is a regularization parameter; the soft-margin SVM trades off between maximizing the margin and minimizing the misclassification error.
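For reference, the primal problem referred to above is the standard soft-margin formulation (notation assumed here: training pairs (xᵢ, yᵢ) with yᵢ ∈ {−1, +1}, slack variables ξᵢ):

$$\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.}\quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i=1,\dots,n$$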
True Error versus Empirical Error
• True Error/Risk: target performance measure
• Classification: probability of misclassification
• Regression: mean squared error
• Performance on a random test/unseen point (X,Y)
• Empirical Error/Risk: performance on training data
• Classification: proportion of misclassified examples
• Regression: average squared error
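In symbols (standard definitions; the notation is assumed here: loss ℓ, predictor f, data distribution P over (X, Y), training set {(xᵢ, yᵢ)}, i = 1..n):

$$R(f) = \mathbb{E}_{(X,Y)\sim P}\big[\ell(f(X), Y)\big] \qquad\qquad \hat{R}_n(f) = \frac{1}{n}\sum_{i=1}^{n}\ell\big(f(x_i), y_i\big)$$

with ℓ the 0–1 loss for classification and the squared loss for regression.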
Overfitting
Example: Linear regression (housing prices)
[Figure: three regression fits of price vs. size, with increasing model complexity]
Overfitting: if we have too many features (an overly complicated predictor), the learned hypothesis may fit the training set very well but fail to generalize to new examples (e.g., fail to predict prices for houses it has not seen).
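A minimal sketch of this effect (not from the slides; synthetic "price vs. size" data and degrees chosen purely for illustration): the training error keeps shrinking as the polynomial degree grows, while the test error typically does not.

```python
# Illustrative sketch: synthetic "price vs. size" data, polynomials of
# increasing degree fitted by least squares; compare train vs. test MSE.
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 3, 30))            # house size (arbitrary units)
y = 2 * x + rng.normal(0, 0.5, 30)            # price = linear trend + noise
x_test = np.sort(rng.uniform(0, 3, 30))
y_test = 2 * x_test + rng.normal(0, 0.5, 30)

for degree in (1, 3, 9):
    coeffs = np.polyfit(x, y, degree)                          # fit on training data only
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree={degree:>2}: train MSE={train_mse:.3f}, test MSE={test_mse:.3f}")
```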
Overfitting versus Underfitting
Example: Logistic regression
[Figure: three decision boundaries of increasing complexity, labelled "Underfitting" (left) and "Overfitting" (right); g = sigmoid function]
Model Complexity
[Figure: error vs. model complexity — empirical (training) error decreases as complexity grows, while true error is U-shaped; underfitting on the left, the best model at the minimum of the true error, overfitting on the right]
Empirical error (training error) is no longer a good indicator of true error
Examples of Model Complexity
● Examples of Model Spaces with increasing complexity:
○ Regression with polynomials of order k=0,1,2,…
Higher degree => higher complexity
○ Decision Trees with depth k or with k leaves
Higher depth / more leaves => higher complexity
○ KNN classifiers with varying neighbourhood sizes k = 1, 2, 3, …
Smaller neighbourhood (smaller k) => higher complexity
Risk Analysis (1)
• True Error/Risk vs Empirical Error/Risk
• Optimal Predictor
• Empirical Risk Minimization (ERM) over a model class
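In the usual notation (the slide's own symbols did not survive extraction, so this is the standard formulation):

$$f^{*} = \arg\min_{f} R(f) \quad \text{(optimal / Bayes predictor)} \qquad\qquad \hat{f}_n = \arg\min_{f\in\mathcal{F}} \hat{R}_n(f) \quad \text{(ERM over the model class } \mathcal{F}\text{)}$$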
Risk Analysis (2)
• Excess Risk = Estimation error + Approximation error
• Estimation error: due to the randomness of the training data (finite sample size)
• Approximation error: due to the restriction to the model class
[Figure: excess risk split into estimation error and approximation error]
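Written out with the notation above, and with $f_{\mathcal{F}} = \arg\min_{f\in\mathcal{F}} R(f)$ the best predictor in the class:

$$\underbrace{R(\hat{f}_n) - R(f^{*})}_{\text{excess risk}} \;=\; \underbrace{R(\hat{f}_n) - R(f_{\mathcal{F}})}_{\text{estimation error}} \;+\; \underbrace{R(f_{\mathcal{F}}) - R(f^{*})}_{\text{approximation error}}$$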
Bias-Variance Trade-off
• Regression: note that even the optimal predictor does not have zero error (there is irreducible noise)
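The standard squared-error decomposition (assumptions: Y = f*(X) + ε with E[ε] = 0, Var(ε) = σ², and the expectation taken over training sets D):

$$\mathbb{E}\big[(Y - \hat{f}_D(x))^2\big] \;=\; \underbrace{\sigma^2}_{\text{noise}} \;+\; \underbrace{\big(\mathbb{E}_D[\hat{f}_D(x)] - f^{*}(x)\big)^2}_{\text{bias}^2} \;+\; \underbrace{\mathbb{E}_D\big[\big(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\big)^2\big]}_{\text{variance}}$$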
Bias
Variance
Bias and Variance: Intuition
[Figure: 2×2 illustration combining low/high variance (columns) with high bias (underfitting) and low bias (overfitting) (rows)]
Bias-Variance Trade-off
• High bias, low variance: poor approximation but robust/stable
• Low bias, high variance: good approximation but unstable
[Figure: fits obtained from 3 independent training datasets]
Model Optimization
[Figure: true vs. empirical error as a function of model complexity (as above); the best model lies between underfitting and overfitting]
Learning Curve
[Figure: two learning curves of error vs. training set size — high bias: training and test/CV error converge to a high error; high variance: a large gap remains between the low training error and the test/CV error]
How to Address Overfitting
● Reduce number of features
○ Feature selection
○ Model selection algorithms
● Regularization
○ Incorporate model complexity for optimization, penalize
complex models using prior knowledge
○ Keep all the features, but reduce magnitude/values of model
parameters
○ Works well when we have a lot of features, each of which
contributes a bit to the prediction
Feature Selection
Idea: find the subset of features that yields the best model => reduces the model complexity
Feature Selection Methods
Supervised Feature Selection
Feature Selection - Filter Methods
Idea: compute an importance score for each feature => choose the most important ones
Feature Selection - Information Gain
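The slide's formula did not survive extraction; the usual information-gain score (reconstructed here as an assumption) measures how much a feature Xⱼ reduces the entropy of the label Y:

$$IG(Y, X_j) = H(Y) - H(Y \mid X_j), \qquad H(Y) = -\sum_{y} p(y)\log p(y)$$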
Feature Selection - Chi-square
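Likewise reconstructing the standard chi-square score (an assumption): it tests independence between a (discrete) feature and the class label using a contingency table of observed counts O and expected counts E under independence:

$$\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}$$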
Feature Selection - Wrapper Methods
Idea: gradually add (or remove) features, using the learner's own performance to decide which to keep
Feature Selection - Regularization
• Regularized learning framework: the training objective includes a term for the cost / complexity of the model (generic form below)
• Penalize complex models using prior knowledge.
• Two Examples
• Regularized Linear Regression (ridge regression)
• Regularized Logistic Regression
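A generic form of the framework referenced above (notation assumed):

$$\min_{f\in\mathcal{F}}\ \hat{R}_n(f) + \lambda\,\Omega(f), \qquad \lambda \ge 0$$

where Ω(f) measures the cost/complexity of the model.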
Regularized Linear Regression
• Linear Regression
• Regularized Linear Regression
• Choice of regularizer:
• "Ridge Regression": squared L2 norm of the weights
• "Lasso" (Least Absolute Shrinkage and Selection Operator): L1 norm of the weights
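The two objectives in a standard form (notation assumed: weights w, examples (xᵢ, yᵢ)):

$$\text{Ridge:}\ \min_{w}\ \sum_{i=1}^{n}\big(y_i - w^\top x_i\big)^2 + \lambda\lVert w\rVert_2^2 \qquad\qquad \text{Lasso:}\ \min_{w}\ \sum_{i=1}^{n}\big(y_i - w^\top x_i\big)^2 + \lambda\lVert w\rVert_1$$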
Regularized Logistic Regression
• L2-regularized Logistic Regression
• L1-regularized Logistic Regression ("Sparse Logistic Regression")
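In a standard form (assumption: labels yᵢ ∈ {−1, +1}, logistic loss):

$$\min_{w}\ \sum_{i=1}^{n}\log\big(1 + e^{-y_i\, w^\top x_i}\big) + \lambda\lVert w\rVert_2^2 \ \ \text{(L2)} \qquad\qquad \min_{w}\ \sum_{i=1}^{n}\log\big(1 + e^{-y_i\, w^\top x_i}\big) + \lambda\lVert w\rVert_1 \ \ \text{(L1, sparse)}$$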
Model Ensemble
• Basic idea: instead of learning one model, learn several and combine them
• Typically improves the accuracy, often by a lot
Why does It Work?
Suppose there are 25 base classifiers
• Each classifier has error rate, ε = 0.35
• Assume classifiers are independent
• Probability that the (majority-vote) ensemble classifier makes a wrong prediction, i.e., that at least 13 of the 25 classifiers misclassify:
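Under the independence assumption this is a binomial tail probability; the ≈ 0.06 value is the standard result quoted for this example:

$$P(\text{ensemble errs}) = \sum_{i=13}^{25}\binom{25}{i}\,\varepsilon^{i}(1-\varepsilon)^{25-i} \approx 0.06 \ \ll\ \varepsilon = 0.35$$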
Bagging Classifiers
• In general, sampling from P(h|D) is difficult
• P(h|D) is difficult to compute
• P(h|D) is impossible to compute for non-probabilistic classifiers such as SVM
• Bagging classifiers:
• Approximate sampling from P(h|D) by sampling training examples
Bootstrap Sampling
• Bagging = Bootstrap aggregating
• Bootstrap sampling: given a set D containing n training examples
• Create Di by drawing n examples at random with replacement from D
• Each Di is expected to leave out about 37% of the examples in D
Bagging
• Sampling with replacement
• Build a classifier on each bootstrap sample
• In a single draw, each example has probability (1 – 1/n) of not being picked
• So each example has probability (1 – 1/n)^n of never appearing in a given bootstrap sample
• This value tends to 1/e ≈ 0.37 for large n
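A quick sanity-check sketch (illustrative only; the sample size and library use are my own choices, not from the slides):

```python
# Empirically check that a bootstrap sample of size n leaves out roughly 37%
# of the original examples (for reasonably large n).
import numpy as np

rng = np.random.default_rng(0)
n = 1000
left_out_fractions = []
for _ in range(100):                        # 100 bootstrap samples
    idx = rng.integers(0, n, size=n)        # draw n indices with replacement
    left_out = n - len(np.unique(idx))      # examples never drawn
    left_out_fractions.append(left_out / n)
print(f"average fraction left out: {np.mean(left_out_fractions):.3f}")  # ~0.368
```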
Bagging Algorithm
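The slide's algorithm listing did not survive extraction; below is a minimal bagging sketch under assumed choices (decision-tree base learners, majority vote over non-negative integer class labels), not necessarily the slide's exact version.

```python
# Minimal bagging sketch (assumptions: numpy arrays X, y with non-negative
# integer class labels; decision-tree base learners; majority vote).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_estimators=25, seed=0):
    """Train n_estimators trees, each on a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)          # sample n indices with replacement
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Majority vote over the base classifiers' predictions."""
    votes = np.stack([m.predict(X) for m in models]).astype(int)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```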
Bagging ~ Bayesian Average
Bagging can be seen as approximating the Bayesian average over hypotheses: instead of sampling models from P(h|D), we train models on bootstrap resamples of D and average their predictions.
Inefficiency with Bagging
• Inefficient bootstrap sampling:
• Every example has equal chance to be
sampled
• No distinction between “easy”
examples and “difficult” examples
• Inefficient model combination:
• A constant weight for each classifier
• No distinction between accurate
classifiers and inaccurate classifiers
Improve the Efficiency of Bagging
• Better sampling strategy
• Focus on the examples that are difficult to classify
• Better combination strategy
• Accurate models should be assigned larger weights
Intuition
Boosting: Example
• Instances that are wrongly classified will have their weights increased
• Instances that are correctly classified will have their weights decreased
• Example 4 is hard to classify
• Its weight is increased, therefore it is more likely to be
chosen again in subsequent rounds
AdaBoost
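The algorithm listing did not survive extraction; for reference, the standard binary AdaBoost update (labels y ∈ {−1, +1}) is: at round t, train hₜ on the weighted data, compute its weighted error εₜ = Σᵢ wᵢ·1[hₜ(xᵢ) ≠ yᵢ], and then

$$\alpha_t = \tfrac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}, \qquad w_i \leftarrow \frac{w_i\,e^{-\alpha_t\, y_i\, h_t(x_i)}}{Z_t}, \qquad H(x) = \operatorname{sign}\Big(\sum_{t}\alpha_t\, h_t(x)\Big)$$

where Zₜ normalizes the weights to sum to 1.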
AdaBoost Example
Stacking
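The slide's diagram did not survive extraction. Stacking trains a meta-learner on the (ideally out-of-fold) predictions of several base learners; a minimal sketch under assumed choices (scikit-learn base learners, logistic-regression meta-learner), not necessarily the slide's own setup:

```python
# Minimal stacking sketch: base learners feed their predictions to a
# logistic-regression meta-learner, using out-of-fold predictions (cv=5).
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier()), ("svm", SVC())],
    final_estimator=LogisticRegression(),   # meta-learner trained on base predictions
    cv=5,                                   # out-of-fold predictions avoid leakage
)
# Usage: stack.fit(X_train, y_train); y_pred = stack.predict(X_test)
```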
Summary
● True Error versus Empirical Error
● Overfitting, Underfitting
● Bias-Variance Tradeoff
● Model Optimization
○ Feature Selection
○ Regularization
○ Model Ensemble
Thank you