
Classification:

A machine learning perspective


Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
©2015-2016 Emily Fox & Carlos Guestrin Machine Learning Specialization
Part of a specialization



This course is a part of the Machine Learning Specialization:

1. Foundations
2. Regression
3. Classification
4. Clustering & Retrieval
5. Recommender Systems
6. Capstone


What is the course about?



What is classification?
From features to predictions

Data → ML Method (Classifier) → Intelligence

Input x: features derived from data
Learn x → y relationship
Predict y: categorical “output”, class or label

Sentiment classifier
Input x: “Easily best sushi in Seattle.”
Sentence Sentiment Classifier
Output y: Sentiment


Classifier

Input x: sentence from review
Classifier MODEL
Output y: predicted class


Example multiclass classifier
Output y has more than 2 categories

Input x: webpage
Output y: Education, Finance, or Technology

Spam filtering
Input x: text of email, sender, IP, …
Output y: Not spam or Spam

Image classification

Input x: image pixels
Output y: predicted object

Personalized medical diagnosis
Input x
Disease Classifier MODEL
Output y: Healthy, Cold, Flu, or Pneumonia


Reading your mind
Inputs x are brain region intensities
Output y: “Hammer” or “House”

Impact of classification



Impact of classification



Course overview



Course philosophy: Always use case studies & …

• Core concept
• Visual
• Algorithm
• Practical
• Implement
• Advanced topics (OPTIONAL)

Overview of content

Models               Algorithms           Core ML
Linear classifiers   Gradient             Alleviating overfitting
Logistic regression  Stochastic gradient  Handling missing data
Decision trees       Recursive greedy     Precision-recall
Ensembles            Boosting             Online learning


Course outline



Overview of modules

Models                                 Algorithms                      Core ML
Linear classifiers (Module 1)          Gradient (Modules 2 & 3)        Alleviating overfitting (Modules 3 & 5)
Logistic regression (Modules 1, 2, 3)  Stochastic gradient (Module 9)  Handling missing data (Module 6)
Decision trees (Modules 4 & 5)         Recursive greedy (Module 4)     Precision-recall (Module 8)
Ensembles (Module 7)                   Boosting (Module 8)             Online learning (Module 9)


Module 1: Linear classifiers
Word Coefficient
#awesome 1.0
#awful -1.5
Score(x) = 1.0 #awesome – 1.5 #awful
[Plot: #awful vs. #awesome, with the regions Score(x) < 0 and Score(x) > 0]

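
As a minimal sketch of the score on this slide: count how often each word occurs and take the weighted sum. The tokenization (lowercasing, whitespace split, stripping punctuation), the function names, and the tie-breaking rule are assumptions for illustration and are not taken from the slides.

```python
# Coefficients from the slide; every other word is assumed to have weight 0.
COEFFICIENTS = {"awesome": 1.0, "awful": -1.5}

def score(sentence):
    """Score(x) = 1.0 * #awesome - 1.5 * #awful for one sentence."""
    words = [w.strip(".,!?\"'") for w in sentence.lower().split()]
    return sum(COEFFICIENTS.get(w, 0.0) for w in words)

def predict(sentence):
    """Return +1 if Score(x) > 0, otherwise -1 (a tie goes to -1 by assumption)."""
    return 1 if score(sentence) > 0 else -1

# Two "awesome" and one "awful": 1.0 * 2 - 1.5 * 1 = 0.5, so the prediction is +1.
print(score("Awesome sushi, awesome service, awful parking."))
print(predict("Awesome sushi, awesome service, awful parking."))
```
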
Module 1: Logistic regression represents probabilities

$$P(y = +1 \mid x, \hat{w}) = \frac{1}{1 + e^{-\hat{w}^{T} h(x)}}$$
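
The probability on this slide can be sketched in a few lines of Python. The function name and the example feature vector h(x) = [1, #awesome, #awful] are assumptions for illustration; the two-branch form only avoids overflow for very negative scores and computes the same value.

```python
import math

def probability_positive(w, h):
    """P(y=+1 | x, w) = 1 / (1 + exp(-w^T h(x))), with h = h(x) as a list."""
    s = sum(wj * hj for wj, hj in zip(w, h))
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)  # equivalent form that cannot overflow when s is very negative
    return e / (1.0 + e)

# Hypothetical example: w = (w0, w1, w2) and h(x) = [1, #awesome, #awful].
w = [0.0, 1.0, -1.5]
print(probability_positive(w, [1, 2, 1]))  # score 0.5, probability about 0.62
print(probability_positive(w, [1, 0, 0]))  # score 0, probability exactly 0.5
```
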


Module 2: Learning “best” classifier
Maximize likelihood over all possible $w_0, w_1, w_2$

$\ell(w_0=0, w_1=1, w_2=-1.5) = 10^{-6}$
$\ell(w_0=1, w_1=1, w_2=-1.5) = 10^{-5}$
$\ell(w_0=1, w_1=0.5, w_2=-1.5) = 10^{-4}$
…

Best model with gradient ascent: highest likelihood $\ell(w)$
$\hat{w} = (w_0=1, w_1=0.5, w_2=-1.5)$

[Plot: #awful vs. #awesome]

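
A minimal sketch of gradient ascent for a logistic regression classifier, under some assumptions: it maximizes the log of the likelihood (which has the same maximizer), uses the standard gradient, sum over data points of h_j(x_i) * (1[y_i = +1] - P(y = +1 | x_i, w)), and runs on a made-up toy dataset. The step size, iteration count, and data are hypothetical and do not reproduce the numbers on the slide.

```python
import math

def prob_positive(w, h):
    """P(y=+1 | x, w) for feature vector h = h(x)."""
    s = sum(wj * hj for wj, hj in zip(w, h))
    return 1.0 / (1.0 + math.exp(-s))

def log_likelihood(w, data):
    """Sum of log P(y_i | x_i, w) over (features, label) pairs, labels in {+1, -1}."""
    total = 0.0
    for h, y in data:
        p = prob_positive(w, h)
        total += math.log(p if y == +1 else 1.0 - p)
    return total

def gradient_ascent(data, step_size=0.1, n_iters=200):
    """Start at w = 0 and repeatedly step along the log-likelihood gradient."""
    w = [0.0] * len(data[0][0])
    for _ in range(n_iters):
        grad = [0.0] * len(w)
        for h, y in data:
            error = (1.0 if y == +1 else 0.0) - prob_positive(w, h)
            for j, hj in enumerate(h):
                grad[j] += hj * error
        w = [wj + step_size * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical reviews: h(x) = [1, #awesome, #awful], label +1 or -1.
TOY_DATA = [
    ([1, 2, 0], +1), ([1, 0, 2], -1), ([1, 1, 1], -1), ([1, 3, 1], +1),
    ([1, 1, 0], +1), ([1, 0, 1], -1), ([1, 2, 1], -1),
]

w_hat = gradient_ascent(TOY_DATA)
print("log-likelihood at w = 0:", log_likelihood([0.0, 0.0, 0.0], TOY_DATA))
print("log-likelihood at w_hat:", log_likelihood(w_hat, TOY_DATA))
```
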
Module 3: Overfitting & regularization
[Plot: classification error vs. model complexity, with true error and training error curves]

Use regularization penalty to mitigate overfitting:

$$\ell(w) - \lambda \|w\|_2^2$$

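
The penalized objective can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, and every coefficient in w is penalized, whereas in practice the intercept is often left out of the penalty.

```python
def l2_penalized_objective(log_likelihood_value, w, l2_penalty):
    """Quality of w: l(w) - lambda * ||w||_2^2 (larger is better)."""
    return log_likelihood_value - l2_penalty * sum(wj * wj for wj in w)

def l2_penalty_gradient(w, l2_penalty):
    """Gradient of the penalty term alone: -2 * lambda * w_j for each coefficient."""
    return [-2.0 * l2_penalty * wj for wj in w]

# Hypothetical numbers: l(w) = -2.0, w = (1, 0.5, -1.5), lambda = 0.1.
# ||w||_2^2 = 1 + 0.25 + 2.25 = 3.5, so the objective is about -2.0 - 0.35 = -2.35.
print(l2_penalized_objective(-2.0, [1.0, 0.5, -1.5], 0.1))
print(l2_penalty_gradient([1.0, 0.5, -1.5], 0.1))
```

A larger λ pushes the coefficients toward zero, trading some training likelihood for a simpler model.
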
Module 4: Decision trees
Start
Credit?
  excellent: Safe
  fair: Term?
    3 years: Risky
    5 years: Safe
  poor: Income?
    high: Term?
      3 years: Risky
      5 years: Safe
    low: Risky
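
One way to sketch prediction with this tree is to store it as nested dictionaries and walk from the root to a leaf. The dictionary layout, key names, and the reading of the branches (Credit? at the root, then Term? or Income?) are assumptions reconstructed from the slide, not the course's own implementation.

```python
# Internal nodes are dicts with a feature name and one child per value;
# leaves are the predicted class as a string.
TERM_SPLIT = {"feature": "term", "branches": {"3 years": "Risky", "5 years": "Safe"}}

TREE = {
    "feature": "credit",
    "branches": {
        "excellent": "Safe",
        "fair": TERM_SPLIT,
        "poor": {
            "feature": "income",
            "branches": {"high": TERM_SPLIT, "low": "Risky"},
        },
    },
}

def predict(tree, x):
    """Follow the branch matching x's value at each split until a leaf is reached."""
    node = tree
    while isinstance(node, dict):
        node = node["branches"][x[node["feature"]]]
    return node

print(predict(TREE, {"credit": "poor", "income": "high", "term": "5 years"}))
print(predict(TREE, {"credit": "fair", "income": "low", "term": "3 years"}))
```
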


Module 5: Overfitting in decision trees
[Figure: decision boundaries for decision trees of depth 1, depth 3, and depth 10, compared with logistic regression using degree 1, degree 2, and degree 6 features]


Module 5: Alleviate overfitting by learning simpler trees
Occam’s Razor: “Among competing hypotheses, the one with fewest assumptions should be selected”, William of Occam, 13th Century

Complex Tree → Simplify → Simpler Tree


Module 6: Handling missing data
Credit     Term   Income  y
excellent  3 yrs  high    safe
fair       ?      low     risky
fair       3 yrs  high    safe
poor       5 yrs  high    risky
excellent  3 yrs  low     risky
fair       5 yrs  high    safe
poor       ?      high    risky
poor       5 yrs  low     safe
fair       ?      high    safe

[Figure: decision tree starting at Credit? (excellent / fair or unknown / poor), followed by Income? (high / low) and Term? (3 years / 5 years) splits, in which one branch of each split also handles “or unknown” values]
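
A hedged sketch of the idea on this slide: each split names one branch that also receives missing values, so a prediction is always possible. Only the Credit? choice (fair or unknown) is read from the slide. Which branch takes unknown values at the Term? and Income? splits is not recoverable from the extracted text, so those choices, along with the dictionary layout and names, are assumptions.

```python
UNKNOWN = "?"

TERM_SPLIT = {
    "feature": "term",
    "unknown": "5 yrs",  # ASSUMPTION: branch choice not recoverable from the slide
    "branches": {"3 yrs": "risky", "5 yrs": "safe"},
}

TREE = {
    "feature": "credit",
    "unknown": "fair",  # from the slide: "fair or unknown"
    "branches": {
        "excellent": "safe",
        "fair": TERM_SPLIT,
        "poor": {
            "feature": "income",
            "unknown": "low",  # ASSUMPTION: branch choice not recoverable from the slide
            "branches": {"high": TERM_SPLIT, "low": "risky"},
        },
    },
}

def predict(tree, x):
    """Walk the tree; a missing or "?" value follows the split's designated branch."""
    node = tree
    while isinstance(node, dict):
        value = x.get(node["feature"], UNKNOWN)
        if value == UNKNOWN:
            value = node["unknown"]
        node = node["branches"][value]
    return node

print(predict(TREE, {"credit": "?", "term": "3 yrs", "income": "high"}))
print(predict(TREE, {"credit": "poor", "term": "?", "income": "high"}))
```

In a learned tree, the branch that receives unknown values would be chosen from the training data rather than fixed by hand as it is here.
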


Module 7: Boosting question
“Can a set of weak learners be combined to create a stronger learner?” Kearns and Valiant (1988)

Yes! Schapire (1990)

Boosting

Amazing impact:
• simple approach
• widely used in industry
• wins most Kaggle competitions

Module 7: Boosting using AdaBoost
Split               Branches                  Vote
Income>$100K?       Yes → Safe, No → Risky    f1(xi) = +1
Credit history?     Bad → Risky, Good → Safe  f2(xi) = -1
Savings>$100K?      Yes → Safe, No → Risky    f3(xi) = -1
Market conditions?  Bad → Risky, Good → Safe  f4(xi) = +1

Ensemble: Combine votes from many simple classifiers to learn complex classifiers
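
A minimal sketch of how an ensemble combines these votes: take the sign of the weighted sum of the individual predictions. The votes f1 to f4 are the ones on the slide; the classifier weights are hypothetical (the slide gives none), and a tie is broken toward -1 by assumption.

```python
def ensemble_predict(votes, weights):
    """Return +1 if the weighted sum of votes is positive, otherwise -1."""
    total = sum(w * f for w, f in zip(weights, votes))
    return 1 if total > 0 else -1

votes = [+1, -1, -1, +1]  # f1(xi), f2(xi), f3(xi), f4(xi) from the slide

# Hypothetical weights: 2.0 - 1.5 - 1.5 + 0.5 = -0.5, so the ensemble predicts -1.
print(ensemble_predict(votes, [2.0, 1.5, 1.5, 0.5]))

# Different hypothetical weights: 2.0 - 0.5 - 0.5 + 1.0 = 2.0, so it predicts +1.
print(ensemble_predict(votes, [2.0, 0.5, 0.5, 1.0]))
```

The same four votes give different ensemble predictions depending on how much each simple classifier is trusted, which is why learning the weights matters.
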


Module 8: Precision-recall
Goal: increase # guests by 30%

Need an automated, “authentic” marketing campaign

[Diagram labels: Reviews, Great quotes, Spokespeople]

“Easily best sushi in Seattle.”

Accuracy not most important metric

PRECISION: Did I (mistakenly) show a negative sentence???
RECALL: Did I not show a (great) positive sentence???

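
The two questions on this slide correspond to the standard definitions: precision is the fraction of shown (predicted positive) sentences that are truly positive, and recall is the fraction of truly positive sentences that were shown. A small sketch, with hypothetical labels and a hypothetical function name:

```python
def precision_recall(y_true, y_pred, positive=+1):
    """Return (precision, recall) for the given positive class."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # shown and truly positive
        elif p == positive:
            fp += 1  # shown, but actually negative
        elif t == positive:
            fn += 1  # truly positive, but not shown
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical sentences: 2 true positives, 1 false positive, 2 false negatives.
y_true = [+1, +1, +1, -1, -1, +1]
y_pred = [+1, -1, +1, +1, -1, -1]
print(precision_recall(y_true, y_pred))  # precision 2/3, recall 2/4
```
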
Module 9: Scaling to huge datasets & online learning

4.8B webpages, 500M Tweets/day, 5B views/day

Stochastic gradient: tiny modification to gradient, a lot faster, but annoying in practice

[Plot: avg. log likelihood for gradient; higher is better]
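
A sketch of the "tiny modification": instead of summing the gradient over all data points before each update, update the coefficients after every single data point. This example assumes a logistic regression model, a made-up toy dataset, and hypothetical values for the step size, number of passes, and random seed.

```python
import math
import random

def prob_positive(w, h):
    """P(y=+1 | x, w) for feature vector h = h(x)."""
    s = sum(wj * hj for wj, hj in zip(w, h))
    return 1.0 / (1.0 + math.exp(-s))

def avg_log_likelihood(w, data):
    """Average of log P(y_i | x_i, w) over the dataset (higher is better)."""
    total = 0.0
    for h, y in data:
        p = prob_positive(w, h)
        total += math.log(p if y == +1 else 1.0 - p)
    return total / len(data)

def stochastic_gradient_ascent(data, step_size=0.1, n_passes=20, seed=0):
    """Update w after each data point, visiting the data in shuffled order."""
    rng = random.Random(seed)
    data = list(data)
    w = [0.0] * len(data[0][0])
    for _ in range(n_passes):
        rng.shuffle(data)
        for h, y in data:
            error = (1.0 if y == +1 else 0.0) - prob_positive(w, h)
            w = [wj + step_size * hj * error for wj, hj in zip(w, h)]
    return w

# Hypothetical reviews: h(x) = [1, #awesome, #awful], label +1 or -1.
TOY_DATA = [
    ([1, 2, 0], +1), ([1, 0, 2], -1), ([1, 1, 1], -1), ([1, 3, 1], +1),
    ([1, 1, 0], +1), ([1, 0, 1], -1), ([1, 2, 1], -1),
]

w_sgd = stochastic_gradient_ascent(TOY_DATA)
print("avg. log likelihood at w = 0:", avg_log_likelihood([0.0, 0.0, 0.0], TOY_DATA))
print("avg. log likelihood at w_sgd:", avg_log_likelihood(w_sgd, TOY_DATA))
```

Each update is cheap because it touches one data point, but the path is noisy, so the result depends on the step size and the visiting order.
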


Assumed background



Courses 1 & 2 in this ML Specialization
• Course 1: Foundations
  - Overview of ML case studies
  - Black-box view of ML tasks
  - Programming & data manipulation skills

• Course 2: Regression
  - Data representation (input, output, features)
  - Linear regression model
  - Basic ML concepts:
    • ML algorithm
    • Gradient descent
    • Overfitting
    • Validation set and cross-validation
    • Bias-variance tradeoff
    • Regularization


Math background
• Basic calculus
- Concept of derivatives
• Basic vectors
• Basic functions
- Exponentiation e^x
- Logarithm


Programming experience
• Basic Python used
  - Can be picked up along the way if you know another language


Reliance on GraphLab Create
• SFrames will be used, though not required
  - open source project of Dato (creators of GraphLab Create)
  - can use pandas and numpy instead
• Assignments will:
  1. Use GraphLab Create to explore high-level concepts
  2. Ask you to implement all algorithms without GraphLab Create
• Net result:
  - learn how to code methods in Python

Computing needs
• Basic 64-bit desktop or laptop
• Access to internet
• Ability to:
- Install and run Python (and GraphLab Create)
- Store a few GB of data



Let’s get started!
