Practical Machine Learning
Verena Kaynig-Fitkau (vkaynig@[Link])
"We are drowning in information and starving for knowledge."
- John Naisbitt
Machine Learning
- Analyze training data
- Make predictions for new unseen data: supervised learning
- Find patterns: unsupervised learning
Machine Learning
- Supervised learning: SVM, decision tree, boosting, random forest
- Unsupervised learning: k-means, mean shift
Supervised Learning
[Figure: labeled data points with features x1, x2, separated by a hyperplane]
Features are important
[Figure: classification example using the features roundness and weight]
Features are important
[Figure: classification example using the features shape and color]
Google's Self-Driving Car
Car Features
- Laser scan
- Intensity model
- Elevation model
- Lane model
- Camera vision
- 2D stationary map
So just measure everything?
More features = better classification?
- Practical issues: data volume, computation overhead
- Theoretical issues: generalization performance, curse of dimensionality
Supervised Learning
[Figure: labeled data points with features x1, x2, separated by a hyperplane]
Perceptron
- x: data point
- y: label
- w: weight vector
- b: bias
Prediction: y = sign(w·x + b)
[Figure: perceptron with inputs x1, x2, x3, weights w1, w2, w3, bias b, and output in {-1, +1}]
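A minimal sketch of perceptron training (the toy data and function names are illustrative, not from the slides): the weights and bias are updated only when a point is misclassified.

```python
import numpy as np

def perceptron_train(X, y, epochs=100, lr=1.0):
    """Learn w, b so that sign(w.x + b) matches the labels y in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:  # misclassified -> update
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Toy linearly separable data.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, 1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron_train(X, y)
print(np.sign(X @ w + b))  # reproduces y
```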
The XOR Problem
[Figure: XOR-labeled data points are not linearly separable]
Support Vector Machine
- Widely used for all sorts of classification problems
- Some people say it is the best off-the-shelf classifier out there
[Link]/isabelle/Projects/SVM/[Link]
Maximum Margin Classification
[Figure: two separating lines in the (x1, x2) plane; the maximum margin separator is preferred]
What about outliers?
- ξ: slack variables
[Figure: data in the (x1, x2) plane where slack variables allow margin violations]
XOR problem revised
Did we add information to make the problem separable?
[Figure: XOR data becomes separable after adding a feature; the separating plane sits at x = 0]
Polynomial Kernel in 3D
[Figure: quadratic kernel mapping of the data into 3D]
Kernel Functions
- Polynomial: K(x, x') = (x·x' + c)^d
- Radial basis function (RBF): K(x, x') = exp(-γ ||x - x'||²)
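A small sketch computing both kernels directly from the standard definitions above (the parameter values c, d, and gamma are illustrative):

```python
import numpy as np

def polynomial_kernel(x, z, c=1.0, d=2):
    # K(x, z) = (x.z + c)^d
    return (np.dot(x, z) + c) ** d

def rbf_kernel(x, z, gamma=1.0):
    # K(x, z) = exp(-gamma * ||x - z||^2)
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([1.0, 2.0])
z = np.array([2.0, 0.5])
print(polynomial_kernel(x, z))  # (3 + 1)^2 = 16.0
print(rbf_kernel(x, z))         # exp(-3.25) ≈ 0.0388
```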
Kernel Trick for SVMs
- Arbitrarily many dimensions
- Little computational cost
- The maximal margin helps with the curse of dimensionality
SVM Applet
http://[Link]/education/lectures_and_seminars/annex_estat/Classifier/[Link]
Tips and Tricks
- SVMs are not scale invariant
- Check if your library normalizes by default
- Normalize your data (see the sketch below):
  - mean: 0, std: 1
  - or map to [0,1] or [-1,1]
- Normalize the test set in the same way!
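A minimal scikit-learn sketch of this advice (the arrays are placeholders): fit the scaler on the training data only, then reuse the same statistics on the test set.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
X_test = np.array([[1.5, 250.0]])

scaler = StandardScaler()                  # rescales each feature to mean 0, std 1
X_train_s = scaler.fit_transform(X_train)  # statistics estimated on training data only
X_test_s = scaler.transform(X_test)        # test set normalized the same way, never refit
```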
Tips and Tricks
- The RBF kernel is a good default
- For parameters, try exponential sequences
- Read: Chih-Wei Hsu et al., A Practical Guide to Support Vector Classification, Bioinformatics (2010)
Parameter Tuning
Given a classification task:
- Which kernel?
- Which kernel parameter values?
- Which value for C?
Try different combinations and take the best.
Grid Search
[Figure: classification performance over a grid of parameter combinations]
Zang et al., Identification of heparin samples that contain impurities or contaminants by chemometric pattern recognition analysis of proton NMR spectral data, Anal Bioanal Chem (2011)
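A grid search sketch with scikit-learn, trying exponential sequences for C and gamma as recommended above (the data set and parameter ranges are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

# Exponential parameter sequences, cross-validated over the grid.
param_grid = {"C": 2.0 ** np.arange(-5, 16, 2),
              "gamma": 2.0 ** np.arange(-15, 4, 2)}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```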
Multi-Class: One vs. All
- Train n classifiers for n classes
- Take the classification with the greatest positive margin
- Slow training
Multi-Class: One vs. One
- Train n(n-1)/2 classifiers
- Take a majority vote
- Fast training
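A short scikit-learn sketch contrasting the two schemes on the iris data (the data set is just an example):

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

ovr = OneVsRestClassifier(SVC(kernel="rbf")).fit(X, y)  # n classifiers
ovo = OneVsOneClassifier(SVC(kernel="rbf")).fit(X, y)   # n(n-1)/2 classifiers, majority vote
print(ovr.predict(X[:3]), ovo.predict(X[:3]))
```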
Decision Tree
[Figure: example tree - "after 10 pm?", "got electricity?", and "got new dvd?" lead to "call friend", "read book", "watch tv", or "play computer"]
Decision Trees
- Fast training
- Fast prediction
- Easy to understand
- Easy to interpret
Decision Tree - Idea
[Figure: recursive axis-aligned partitioning of the feature space into regions A-E]
Bishop, Pattern Recognition and Machine Learning, Springer, 2006
Decision Tree - Prediction
[Figure: a query point is routed through the splits to one of the regions A-E]
Decision Tree - Training
Learn the tree structure:
- which feature to query
- which threshold to choose
[Figure: tree with leaves A-E]
Node Purity
[Figure: class counts at each node of the example tree]
When to Stop
- the node contains only one class
- the node contains fewer than x data points
- the max depth is reached
- the node purity is sufficient
- you start to overfit => cross-validation
Decision Trees - Disadvantages
- Sensitive to small changes in the data
- Overfitting
- Only axis-aligned splits
Decision Trees vs SVM
[Figure: decision boundaries of a decision tree and an SVM compared]
Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer (2009)
Wisdom of Crowds
"The collective knowledge of a diverse and independent body of people typically exceeds the knowledge of any single individual, and can be harnessed by voting."
- James Surowiecki
http://[Link]/
Ensemble Methods
- A single decision tree does not perform well
- But it is super fast
- What if we learn multiple trees?
- For multiple trees we need even more data!
Bootstrap
- Resampling method from statistics
- Useful to get error bars on estimates
- Take N data points
- Draw N times with replacement
- Get an estimate from each bootstrapped sample
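A minimal NumPy sketch of the bootstrap (the normally distributed toy data is illustrative): draw N points with replacement, recompute the estimate on each sample, and read the error bar off the spread.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=100)  # N data points

# Draw N times with replacement; compute the estimate on each bootstrap sample.
boot_means = [rng.choice(data, size=len(data), replace=True).mean()
              for _ in range(1000)]
print(np.mean(boot_means), np.std(boot_means))   # estimate and its error bar
```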
Bagging
- Bootstrap aggregating
- Sample with replacement from your data set
- Learn a classifier for each bootstrap sample
- Average the results
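A bagging sketch with scikit-learn (the synthetic data and tree count are illustrative): 50 trees, each fit on its own bootstrap sample, combined by voting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)

# Each tree sees a bootstrap sample; predictions are aggregated by voting.
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                        bootstrap=True, random_state=0).fit(X, y)
print(bag.score(X, y))
```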
Bagging Example
[Figure: decision boundaries of bagged classifiers in the (x1, x2) plane]
Bagging
- Reduces overfitting (variance)
- Normally uses one type of classifier
- Decision trees are popular
- Easy to parallelize
Boosting
- Also an ensemble method, like bagging
- But: weak learners evolve over time, and votes are weighted
- Better than bagging for many applications
- Very popular method
Boosting
"Boosting is one of the most powerful learning ideas introduced in the last twenty years."
Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer (2009)
AdaBoost
[Figure: AdaBoost decision boundary in the (x1, x2) plane]
AdaBoost
Initialize the weights for the data points.
For each iteration:
- Fit a classifier to the training data
- Compute the weighted classification error
- Compute the weight for the classifier from the error
- Update the weights for the data points
The final classifier is a weighted sum of all single classifiers (see the sketch below).
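A minimal AdaBoost sketch following the steps above, assuming labels in {-1, +1} and decision stumps as weak learners (the epsilon guard against division by zero is an implementation detail, not from the slides):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, n_rounds=20):
    n = len(y)
    w = np.full(n, 1.0 / n)                      # initialize data point weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w * (pred != y)) / np.sum(w)        # weighted error
        alpha = 0.5 * np.log((1 - err) / (err + 1e-10))  # classifier weight
        w *= np.exp(-alpha * y * pred)                   # reweight data points
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # Final classifier: sign of the weighted sum of all single classifiers.
    return np.sign(sum(a * s.predict(X) for s, a in zip(stumps, alphas)))
```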
AdaBoost
[Figure from Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer (2009)]
AdaBoost
- Introduced by Freund and Schapire in 1995
- Worked great, but nobody understood why!
- Five years later (Friedman et al. 2000): AdaBoost minimizes an exponential loss function
- There still are open questions
Random Forest
- Builds upon the idea of bagging
- Each tree is built from a bootstrap sample
- Node splits are calculated from random feature subsets
http://[Link]%[Link]/articles/about/fun
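A random forest sketch with scikit-learn (the data and parameter values are illustrative); max_features controls the size of the random feature subset considered at each split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Bootstrap samples per tree + random feature subsets per split.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            random_state=0).fit(X, y)
print(rf.score(X, y))
```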
Random Forest
- All trees are fully grown, no pruning
- Two parameters: number of trees, number of features
Random Forest Error Rate
Error depends on:
- Correlation between trees (higher is worse)
- Strength of single trees (higher is better)
Increasing the number of features for each split:
- Increases correlation
- Increases the strength of single trees
Out of Bag Error
- Each tree is trained on a bootstrapped sample
- About 1/3 of the data points are not used for training
- Predict these unseen points with each tree
- Measure the error
Out of Bag Error
[Figure: data points -> bootstrap sample used for training; the unused data points serve as the test set]
Out of Bag Error
- Very similar to cross-validation
- Measured during training
- Can be too optimistic
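A sketch of the out-of-bag error in scikit-learn (the data is illustrative): with oob_score=True each point is evaluated only by the trees that never saw it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, random_state=0)

rf = RandomForestClassifier(n_estimators=100, oob_score=True,
                            random_state=0).fit(X, y)
print(rf.oob_score_)  # out-of-bag accuracy, measured during training
```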
Variable Importance
- Again use the out-of-bag samples
- Predict the class for these samples
- Randomly permute the values of one feature
- Predict the classes again
- Measure the decrease in accuracy
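A permutation importance sketch; note that scikit-learn's permutation_importance uses a held-out set rather than the out-of-bag samples, but the idea (permute one feature, measure the accuracy drop) is the same.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Permute one feature at a time and measure the decrease in accuracy.
result = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
print(result.importances_mean)
```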
Tempting Scenario
- Run a random forest with all features
- Reduce the number of features based on the importance weights
- Run again with the reduced feature set and report the out-of-bag error
This does not measure test performance!
Unbalanced Classes
- The problem: one class has many more training examples than the other
- Oversample the smaller class
- Subsample the larger class
- Subsample for each tree!
Random Forest Subsampling
[Figure: each tree is trained on its own subsample of the data]
Random Forest
- Similar to bagging
- Easy to parallelize
- Packaged with some neat functions:
  - Out-of-bag error
  - Feature importance measure
  - Proximity estimation
Cascade Classifier
- Ensemble methods are strong
- But prediction is slow
- Solution: make prediction faster
- Idea: build a cascade

Cascade Classifier
http://[Link]/wiki/Viola%E2%80%93Jones_object_detection_framework
Viola Jones Face Detection
http://[Link]/
Viola Jones Face Detection
- Takes long to train
- Prediction in real time!
- Widely used today
Summary
- SVMs
- Decision Trees
- Bootstrap, Bagging, Boosting
- Random Forest
- Cascade Classifier

Further Reading
Error Measures
- True positive (tp), true negative (tn), false positive (fp), false negative (fn)

            predicted
             1    -1
true    1 | tp    fn
       -1 | fp    tn
TPR and FPR
- True Positive Rate: TPR = tp / (tp + fn)
- False Positive Rate: FPR = fp / (fp + tn)
Precision Recall
- Recall: tp / (tp + fn)
- Precision: tp / (tp + fp)
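A small sketch computing these quantities from the confusion matrix (the toy labels are illustrative):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = np.array([1, 1, 1, -1, -1, -1, -1, 1])
y_pred = np.array([1, -1, 1, -1, 1, -1, -1, 1])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp / (tp + fn))  # TPR = recall
print(fp / (fp + tn))  # FPR
print(tp / (tp + fp))  # precision
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred))
```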
Precision Recall Curve
[Figure: precision vs. recall, both axes ranging from 0 to 1]
Comparison
J. Davis & M. Goadrich, The Relationship Between Precision-Recall and ROC Curves, ICML (2006)
F-measure
- Weighted harmonic mean of precision and recall:
  F_β = (1 + β²) · precision · recall / (β² · precision + recall)
- Usual case: β = 1
- Increasing β allocates more weight to recall
Clustering Evaluation Criteria
- Based on expert knowledge
- Debatable for real data
- Hidden unknown structures could be present
- Do we even want to just reproduce known structure?
Rand Index
- Percentage of correct classifications
- Compare pairs of elements:
  RI = (tp + tn) / (tp + tn + fp + fn)
- fp and fn are equally weighted
Stability
- What is the right number of clusters?
- What makes a good clustering solution?
- Clustering should generalize!

Stability
[Figure: comparing clusterings across resampled data sets]
Gini Impurity
Example: 4 red, 3 green, 3 blue data points.
- Probability of drawing each class in a random sample: red 4/10, green 3/10, blue 3/10
- Misclassification probability for red: 4/10 * (3/10 + 3/10), i.e. the probability of a true red point times the probability of a wrong class prediction
Gini Impurity
- Number of classes: K
- Number of data points: N
- Number of data points of class i: N_i
With p_i = N_i / N, the expected rate of wrong class predictions is
  Gini = Σ_i p_i (1 - p_i) = 1 - Σ_i p_i²
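A direct implementation of this formula (the function name is illustrative), checked against the 4-red/3-green/3-blue example above:

```python
import numpy as np

def gini_impurity(counts):
    """Gini impurity 1 - sum_i p_i^2 from the per-class counts at a node."""
    p = np.asarray(counts, dtype=float)
    p /= p.sum()
    return 1.0 - np.sum(p ** 2)

# Slide example: 4 red, 3 green, 3 blue.
print(gini_impurity([4, 3, 3]))  # 1 - (0.16 + 0.09 + 0.09) = 0.66
```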
Gini Impurity
[Figure from Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer (2009)]
Node Purity Gain
Compare:
- the Gini impurity of the parent node
- the Gini impurity of the child nodes
[Figure: a split of parent node A into child nodes B and C]
Pseudocode
1. Check for the base cases
2. For each attribute a, calculate the gain from splitting on a
3. Let a_best be the attribute with the highest gain
4. Create a decision node that splits on a_best
5. Repeat on the sub-nodes
http://[Link]/wiki/C4.5_algorithm
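A runnable Python sketch of this pseudocode for numeric features: a simplified ID3/CART-style learner using Gini gain, not the full C4.5 algorithm (all names are illustrative).

```python
import numpy as np

def gini(y):
    # Gini impurity 1 - sum_i p_i^2 of the labels at a node.
    _, counts = np.unique(y, return_counts=True)
    p = counts / len(y)
    return 1.0 - np.sum(p ** 2)

def majority(y):
    values, counts = np.unique(y, return_counts=True)
    return values[np.argmax(counts)]

def build_tree(X, y, depth=0, max_depth=3):
    # Base cases: pure node or maximum depth reached -> leaf with majority class.
    if len(np.unique(y)) == 1 or depth == max_depth:
        return majority(y)
    best = None
    for a in range(X.shape[1]):            # for each attribute a
        for t in np.unique(X[:, a])[:-1]:  # candidate thresholds
            left = X[:, a] <= t
            # Gain: parent impurity minus weighted child impurities.
            gain = gini(y) - (left.mean() * gini(y[left])
                              + (~left).mean() * gini(y[~left]))
            if best is None or gain > best[0]:
                best = (gain, a, t)
    if best is None:                       # no split possible
        return majority(y)
    _, a, t = best                         # split on a_best
    left = X[:, a] <= t
    return (a, t, build_tree(X[left], y[left], depth + 1, max_depth),
                  build_tree(X[~left], y[~left], depth + 1, max_depth))

def predict_one(node, x):
    # Route a point through the splits until a leaf (class label) is reached.
    while isinstance(node, tuple):
        a, t, lo, hi = node
        node = lo if x[a] <= t else hi
    return node
```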