Classification:

A machine learning perspective

Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
©2015-2016 Emily Fox & Carlos Guestrin Machine Learning Specialization
Part of a specialization

This course is a part of the Machine Learning Specialization

1. Foundations
2. Regression
3. Classification
4. Clustering & Retrieval
5. Recommender Systems
6. Capstone

What is the course about?


What is classification?
From features to predictions

Data → ML Method (Classifier) → Intelligence

Input x: features derived from data
Learn the x → y relationship
Predict y: categorical “output”, class or label

Sentiment classifier
Input x: “Easily best sushi in Seattle.” → Sentence Sentiment Classifier → Output y: Sentiment

Classifier

Input x: sentence from review → Classifier MODEL → Output y: predicted class

Example multiclass classifier
Output y has more than 2 categories

Input x: webpage
Output y: Education, Finance, or Technology

Spam filtering
Input x: text of email, sender, IP, …
Output y: Spam or Not spam

Image classification

Input x: image pixels
Output y: predicted object

Personalized medical diagnosis
Input x → Disease Classifier MODEL → Output y: Healthy, Cold, Flu, Pneumonia, …

Reading your mind
Inputs x are brain region intensities
Output y: “Hammer” or “House”

Impact of classification


Course overview


Course philosophy: Always use case studies & …

Core concept | Visual | Algorithm
Advanced topics (OPTIONAL) | Practical | Implement

Overview of content

Models                Algorithms            Core ML
Linear classifiers    Gradient              Alleviating overfitting
Logistic regression   Stochastic gradient   Handling missing data
Decision trees        Recursive greedy      Precision-recall
Ensembles             Boosting              Online learning

Course outline


Overview of modules

Models                                  Algorithms                       Core ML
Linear classifiers (Module 1)           Gradient (Modules 2 & 3)         Alleviating overfitting (Modules 3 & 5)
Logistic regression (Modules 1, 2, 3)   Stochastic gradient (Module 9)   Handling missing data (Module 6)
Decision trees (Modules 4 & 5)          Recursive greedy (Module 4)      Precision-recall (Module 8)
Ensembles (Module 7)                    Boosting (Module 8)              Online learning (Module 9)

Module 1: Linear classifiers
Word       Coefficient
#awesome   1.0
#awful     -1.5

Score(x) = 1.0 #awesome - 1.5 #awful

[Figure: decision boundary in the plane of #awesome (horizontal axis) and #awful (vertical axis), separating the regions Score(x) > 0 and Score(x) < 0]

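A minimal sketch of the score above, assuming words are counted by simple whitespace splitting and that any word outside the table has coefficient 0. The function names and the tie rule at Score(x) = 0 are illustrative assumptions, not part of the slides.

```python
# Coefficients from the slide's table; words not listed contribute 0 (assumption).
COEFFICIENTS = {"awesome": 1.0, "awful": -1.5}

def score(sentence: str) -> float:
    """Score(x) = 1.0 * #awesome - 1.5 * #awful, using naive word counting."""
    words = sentence.lower().split()
    return sum(COEFFICIENTS.get(word.strip(".,!?"), 0.0) for word in words)

def predict(sentence: str) -> int:
    """Return +1 if Score(x) > 0, else -1 (a score of exactly 0 maps to -1 by assumption)."""
    return 1 if score(sentence) > 0 else -1

review = "The sushi was awesome, the service awful, but the view awesome!"
print(score(review))    # 2 * 1.0 - 1 * 1.5 = 0.5
print(predict(review))  # +1, since the score is positive
```
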
Module 1: Logistic regression represents probabilities
$$\hat{P}(y = +1 \mid x, \hat{w}) = \frac{1}{1 + e^{-\hat{w}^{T} h(x)}}$$

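The probability formula on this slide can be sketched in a few lines. The feature layout h(x) = [1, #awesome, #awful] and the weight values below are assumptions chosen to match the Module 1 example; the sketch does not guard against overflow for very large negative scores.

```python
import math
from typing import Sequence

def probability_positive(w: Sequence[float], h: Sequence[float]) -> float:
    """P(y = +1 | x, w) = 1 / (1 + exp(-w^T h(x)))."""
    score = sum(w_j * h_j for w_j, h_j in zip(w, h))
    return 1.0 / (1.0 + math.exp(-score))

# Assumed features h(x) = [1, #awesome, #awful]; weights reuse the Module 1 coefficients with w0 = 0.
w_hat = [0.0, 1.0, -1.5]
print(probability_positive(w_hat, [1, 0, 0]))  # score 0 gives probability 0.5
print(probability_positive(w_hat, [1, 3, 0]))  # positive score gives probability above 0.5
print(probability_positive(w_hat, [1, 0, 2]))  # negative score gives probability below 0.5
```
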

Module 2: Learning “best” classifier
Maximize likelihood over all possible $w_0, w_1, w_2$

$\ell(w_0=0, w_1=1, w_2=-1.5) = 10^{-6}$
$\ell(w_0=1, w_1=1, w_2=-1.5) = 10^{-5}$
$\ell(w_0=1, w_1=0.5, w_2=-1.5) = 10^{-4}$

Best model, found with gradient ascent: highest likelihood $\ell(w)$
$\hat{w} = (w_0=1, w_1=0.5, w_2=-1.5)$

[Figure: candidate decision boundaries in the plane of #awesome (horizontal axis) and #awful (vertical axis)]

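As a rough sketch of what "maximize likelihood with gradient ascent" can look like, the code below climbs the log of the likelihood for a logistic regression model. The six data points, the step size, and the iteration count are made-up illustrative values, and the features are assumed to be h(x) = [1, #awesome, #awful].

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(w, h):
    return sum(w_j * h_j for w_j, h_j in zip(w, h))

def log_likelihood(w, features, labels):
    """Sum of ln P(y_i | x_i, w) over the data, with labels in {+1, -1}."""
    return sum(math.log(sigmoid(y * dot(w, h))) for h, y in zip(features, labels))

def gradient_ascent(features, labels, step_size=0.1, iterations=200):
    """Repeatedly move w in the direction of the log-likelihood gradient."""
    w = [0.0] * len(features[0])
    for _ in range(iterations):
        gradient = [0.0] * len(w)
        for h, y in zip(features, labels):
            # Indicator that y_i = +1, minus the predicted probability of +1.
            error = (1.0 if y == 1 else 0.0) - sigmoid(dot(w, h))
            for j, h_j in enumerate(h):
                gradient[j] += h_j * error
        w = [w_j + step_size * g_j for w_j, g_j in zip(w, gradient)]
    return w

# Hypothetical toy data: h(x) = [1, #awesome, #awful], y in {+1, -1}.
features = [[1, 2, 0], [1, 0, 2], [1, 1, 1], [1, 3, 1], [1, 1, 0], [1, 0, 1]]
labels = [1, -1, -1, 1, 1, -1]

w_hat = gradient_ascent(features, labels)
print(log_likelihood([0.0, 0.0, 0.0], features, labels))  # starting point
print(log_likelihood(w_hat, features, labels))            # higher after ascent
```

The sketch works with the log of the likelihood rather than the likelihood itself; both have the same maximizer, and the sum is easier to differentiate than a product.
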
Module 3: Overfitting & regularization
[Figure: classification error vs. model complexity, with curves for true error and training error]

Use regularization penalty to mitigate overfitting:

$$\ell(w) - \lambda \|w\|_2^2$$

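A small sketch of the penalized objective ℓ(w) - λ||w||₂² and its effect on the gradient. The helper names are hypothetical, and this version penalizes every coefficient exactly as the formula is written; whether the intercept w0 should be excluded is left open here.

```python
def l2_penalty(w, lam):
    """lambda * ||w||_2^2, the sum of squared coefficients scaled by lambda."""
    return lam * sum(w_j * w_j for w_j in w)

def regularized_objective(quality, w, lam):
    """Quality of fit l(w) minus the L2 penalty."""
    return quality - l2_penalty(w, lam)

def regularized_gradient(quality_gradient, w, lam):
    """Gradient of l(w) - lambda * ||w||_2^2: each component loses 2 * lambda * w_j."""
    return [g_j - 2.0 * lam * w_j for g_j, w_j in zip(quality_gradient, w)]

w = [1.0, 0.5, -1.5]
print(l2_penalty(w, lam=0.1))                       # 0.1 * (1 + 0.25 + 2.25), about 0.35
print(regularized_gradient([0.0, 0.0, 0.0], w, lam=0.1))  # pulls each w_j toward 0
```

A larger λ pulls the coefficients harder toward zero, which trades some training fit for a simpler model.
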
Module 4: Decision trees
Start
- Credit?
  - excellent: Safe
  - fair: Term?
    - 3 years: Risky
    - 5 years: Safe
  - poor: Income?
    - high: Term?
      - 3 years: Risky
      - 5 years: Safe
    - low: Risky

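The tree on this slide reads directly as nested conditionals. A minimal sketch follows, with the function name and the string encodings of the feature values chosen for illustration.

```python
def predict_loan(credit: str, term: str, income: str) -> str:
    """Walk the slide's decision tree from the root (Credit?) down to a leaf."""
    if credit == "excellent":
        return "Safe"
    if credit == "fair":
        # Fair credit: the decision depends only on the loan term.
        return "Risky" if term == "3 years" else "Safe"
    if credit == "poor":
        if income == "high":
            # Poor credit but high income: fall back to the loan term.
            return "Risky" if term == "3 years" else "Safe"
        return "Risky"
    raise ValueError(f"unexpected credit value: {credit!r}")

print(predict_loan("excellent", "3 years", "low"))  # Safe
print(predict_loan("fair", "3 years", "high"))      # Risky
print(predict_loan("poor", "5 years", "high"))      # Safe
print(predict_loan("poor", "5 years", "low"))       # Risky
```
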

Module 5: Overfitting in decision trees
[Figure: decision boundaries as model complexity grows. Decision tree: Depth 1, Depth 3, Depth 10. Logistic regression: Degree 1 features, Degree 2 features, Degree 6 features.]


Module 5: Alleviate overfitting by learning simpler trees
Occam’s Razor: “Among competing hypotheses, the one with fewest assumptions should be selected”, William of Occam, 13th Century

Complex Tree → Simplify → Simpler Tree

Module 6: Handling missing data
Credit      Term    Income   y
excellent   3 yrs   high     safe
fair        ?       low      risky
fair        3 yrs   high     safe
poor        5 yrs   high     risky
excellent   3 yrs   low      risky
fair        5 yrs   high     safe
poor        ?       high     risky
poor        5 yrs   low      safe
fair        ?       high     safe

[Figure: decision tree starting at Credit? (excellent, fair, poor), with Term? and Income? splits and Safe/Risky leaves; some branches carry the extra label “or unknown”, so data points with a missing value follow a designated branch]

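One way to sketch the "or unknown" idea is to store, at every split, the branch that missing values should follow. The tree encoding below is hypothetical, and the particular unknown-branch choices are assumptions for illustration only; a learning algorithm would pick them from data.

```python
def term_node(unknown_branch):
    """A Term? split; `unknown_branch` names the branch that missing values follow."""
    return {"feature": "term", "unknown": unknown_branch,
            "branches": {"3 yrs": "risky", "5 yrs": "safe"}}

# Assumed assignment of "or unknown" to branches (illustrative, not read off the slide).
TREE = {
    "feature": "credit", "unknown": "fair",
    "branches": {
        "excellent": "safe",
        "fair": term_node("5 yrs"),
        "poor": {"feature": "income", "unknown": "low",
                 "branches": {"high": term_node("3 yrs"), "low": "risky"}},
    },
}

def predict(node, row):
    """Descend the tree; a missing value (None or "?") takes the node's unknown branch."""
    while isinstance(node, dict):
        value = row.get(node["feature"])
        if value is None or value == "?":
            value = node["unknown"]
        node = node["branches"][value]
    return node

# Two rows with a missing Term from the table above.
print(predict(TREE, {"credit": "poor", "term": "?", "income": "high"}))
print(predict(TREE, {"credit": "fair", "term": "?", "income": "high"}))
```

With this approach every data point still reaches a leaf, so no rows or features need to be discarded.
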

Module 7: Boosting question
“Can a set of weak learners be combined to create a stronger learner?” Kearns and Valiant (1988)

Yes! Schapire (1990)

Boosting

Amazing impact:
• simple approach
• widely used in industry
• wins most Kaggle competitions

Module 7: Boosting using AdaBoost
Classifier   Question             Branches                  Vote
f1           Income>$100K?        Yes: Safe, No: Risky      f1(xi) = +1
f2           Credit history?      Bad: Risky, Good: Safe    f2(xi) = -1
f3           Savings>$100K?       Yes: Safe, No: Risky      f3(xi) = -1
f4           Market conditions?   Bad: Risky, Good: Safe    f4(xi) = +1

Ensemble: Combine votes from many simple classifiers to learn complex classifiers

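The combination step can be sketched as a weighted vote: the ensemble predicts the sign of the weighted sum of the individual votes. This shows only how votes are combined, not how AdaBoost trains the classifiers or learns the weights; the weight values below are hypothetical.

```python
def ensemble_predict(votes, weights):
    """Return +1 or -1: the sign of sum_t weight_t * f_t(x). A sum of 0 maps to -1 by assumption."""
    total = sum(w_t * f_t for w_t, f_t in zip(weights, votes))
    return 1 if total > 0 else -1

votes = [+1, -1, -1, +1]  # f1(xi) .. f4(xi) from the slide

# Hypothetical weights; in AdaBoost these would be learned, one per weak classifier.
print(ensemble_predict(votes, [2.0, 0.5, 0.5, 1.0]))  # 2.0 - 0.5 - 0.5 + 1.0 = 2.0, so +1
print(ensemble_predict(votes, [2.0, 1.5, 1.5, 0.5]))  # 2.0 - 1.5 - 1.5 + 0.5 = -0.5, so -1
```

The same four votes give different ensemble predictions under different weights, which is why learning good weights matters.
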

Module 8: Precision-recall
Goal: increase # guests by 30%

Need an automated, “authentic” marketing campaign
Reviews | Great quotes | Spokespeople
“Easily best sushi in Seattle.”

Accuracy not most important metric

PRECISION: Did I (mistakenly) show a negative sentence?
RECALL: Did I not show a (great) positive sentence?

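The two questions above correspond to the standard definitions: precision is the fraction of shown (predicted positive) sentences that are truly positive, and recall is the fraction of truly positive sentences that were shown. A minimal sketch, with made-up labels and with the convention (an assumption) that an undefined ratio is reported as 0.0:

```python
def precision_recall(true_labels, predicted_labels):
    """Compute (precision, recall) for labels in {+1, -1}."""
    tp = fp = fn = 0
    for y, y_hat in zip(true_labels, predicted_labels):
        if y_hat == 1 and y == 1:
            tp += 1   # shown and truly positive
        elif y_hat == 1 and y == -1:
            fp += 1   # shown by mistake (hurts precision)
        elif y_hat == -1 and y == 1:
            fn += 1   # great sentence not shown (hurts recall)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical labels for six review sentences.
true_labels = [1, 1, 1, -1, -1, 1]
predicted_labels = [1, -1, 1, 1, -1, 1]
print(precision_recall(true_labels, predicted_labels))  # (0.75, 0.75)
```
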
Module 9: Scaling to huge datasets & online learning

4.8B webpages | 500M Tweets/day | 5B views/day

Stochastic gradient: tiny modification to gradient, a lot faster, but annoying in practice

[Figure: Avg. log likelihood (higher is better), with a curve for Gradient]

Assumed background

Courses 1 & 2 in this ML Specialization
• Course 1: Foundations
- Overview of ML case studies
- Black-box view of ML tasks
- Programming & data manipulation skills

• Course 2: Regression
- Data representation (input, output, features)
- Linear regression model
- Basic ML concepts:
  • ML algorithm
  • Gradient descent
  • Overfitting
  • Validation set and cross-validation
  • Bias-variance tradeoff
  • Regularization

Math background
• Basic calculus
- Concept of derivatives
• Basic vectors
• Basic functions
- Exponentiation e^x
- Logarithm

Programming experience
• Basic Python used
- Can pick up along the way if you know another programming language

Reliance on GraphLab Create
• SFrames will be used, though not required
- open source project of Dato (creators of GraphLab Create)
- can use pandas and numpy instead
• Assignments will:
1. Use GraphLab Create to explore high-level concepts
2. Ask you to implement all algorithms without GraphLab Create
• Net result:
- learn how to code methods in Python
Computing needs
• Basic 64-bit desktop or laptop
• Access to internet
• Ability to:
- Install and run Python (and GraphLab Create)
- Store a few GB of data

Let’s get started!
