10-315
Machine Learning
Problem Formulation
Instructor: Pat Virtue
Today
Autoencoder (Aliens) (previous slides)
▪ Features
ML Problem Formulation
▪ Task input and output
▪ Task, Performance, Experience
▪ Data and notation
▪ Examples: Iris Classification and Car Price Regression
ML Training and Models
▪ Linear
▪ Memorization
▪ Nearest Neighbor
ML Problem Formulation
Agents
An agent is an entity that perceives
and acts.
Actions can have an effect on the
environment.
The specific sensors and actuators
affect what the agent is capable
of perceiving and what actions it
is capable of taking
[Diagram: the Environment sends Percepts to the Agent through its Sensors; the Agent (?) sends Actions back to the Environment through its Actuators.]
Slide credit: [Link]
Agent: Simple Input/Output Task
Input → Agent (?) → Predicted Output
Task Input and Output
Input              | Task                     | Output
Petal measurements | Iris classification      | Category
Time of day        | Traffic prediction       | Traffic Volume
Image              | Image classification     | Category
Image              | Image denoising          | Image
Text               | Text to image generation | Image
???                | Face generation          | Image
Task: Face Generation
[Link]
Machine Learning Problem Formulation
Three components <T,P,E>:
1. Task, T
2. Performance measure, P
3. Experience, E
Definition of learning:
A computer program learns if its performance at tasks in T, as
measured by P, improves with experience E
Definition from (Mitchell, 1997)
Machine Learning Problem Formulation
Notation
Task
Formalize the task as a mapping from input to output: $h(x) \to \hat{y}$
Experience
Data! Task experience examples will usually be (input, measured output) pairs: $\mathcal{D} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{N}$
Performance measure
Objective function that gives a single numerical value representing how well the system performs for a given dataset
▪ Classification: error rate, $\frac{1}{N} \sum_{i=1}^{N} \mathbb{1}(y^{(i)} \neq \hat{y}^{(i)})$
▪ Regression: mean squared error, $\frac{1}{N} \sum_{i=1}^{N} (y^{(i)} - \hat{y}^{(i)})^2$
Slide: CMU ML, Tom Mitchell and Roni Rosenfeld
Notation alert: Indicator function
$\mathbb{1}(z) = \mathbf{1}(z) = \begin{cases} 1 & \text{if } z \text{ is true} \\ 0 & \text{otherwise} \end{cases}$
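The two performance measures above can be sketched in plain Python. This is a minimal illustration, not course-provided code: the function names `error_rate` and `mean_squared_error` and the sample values are assumptions made up for this example.

```python
def error_rate(y_true, y_pred):
    """Fraction of examples whose prediction differs from the measured label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # int(a != b) plays the role of the indicator function 1(y != y_hat).
    return sum(int(a != b) for a, b in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted outputs."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and predictions, for illustration only.
labels = [0, 0, 1, 1, 2]
predicted_labels = [0, 1, 1, 1, 0]
print(error_rate(labels, predicted_labels))  # 2 of the 5 predictions differ

targets = [1.0, 2.0, 3.0]
predicted_values = [1.0, 2.5, 2.0]
print(mean_squared_error(targets, predicted_values))  # (0 + 0.25 + 1) / 3
```

Both functions return a single number for a whole dataset, which is what makes them usable as objective functions.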
Experience: Data and Notation
Example Dataset: Fisher Iris Dataset
Fisher (1936) used measurements of 150 flowers from 3 different species: Iris setosa (0), Iris versicolor (1), Iris virginica (2), collected by Anderson (1936)
Species | Sepal Length | Sepal Width | Petal Length | Petal Width
0       | 4.3          | 3.0         | 1.1          | 0.1
0       | 4.9          | 3.6         | 1.4          | 0.1
0       | 5.3          | 3.7         | 1.5          | 0.2
1       | 4.9          | 2.4         | 3.3          | 1.0
1       | 5.7          | 2.8         | 4.1          | 1.3
1       | 6.3          | 3.3         | 4.7          | 1.6
2       | 5.9          | 3.0         | 5.1          | 1.8
Images and full dataset: [Link]
Example Dataset: Fisher Iris Dataset
    from sklearn import datasets
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
Assume samples in data are i.i.d.
Dataset notation
$\mathcal{D} = \{(y^{(i)}, \mathbf{x}^{(i)})\}_{i=1}^{N} = \{(y^{(i)}, x_1^{(i)}, x_2^{(i)}, x_3^{(i)}, x_4^{(i)})\}_{i=1}^{N}$
Linear algebra can represent all data
$\mathbf{y} \in \{0, 1, 2\}^{N}$
$X \in \mathbb{R}^{N \times 4}$ (design matrix)
Data point $i = 6$: $(y^{(6)}, \mathbf{x}^{(6)})$, the sixth row of the sample table (species 1; measurements 6.3, 3.3, 4.7, 1.6)
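The dataset notation can be made concrete with the seven sample rows shown on the slide. This is a sketch under stated assumptions: it uses plain Python lists rather than the scikit-learn arrays, the variable names are chosen for this example, and the full dataset has N = 150 rather than the 7 rows used here. Note that the math is 1-indexed while Python is 0-indexed.

```python
# The seven sample rows from the slide:
# (species, sepal length, sepal width, petal length, petal width)
rows = [
    (0, 4.3, 3.0, 1.1, 0.1),
    (0, 4.9, 3.6, 1.4, 0.1),
    (0, 5.3, 3.7, 1.5, 0.2),
    (1, 4.9, 2.4, 3.3, 1.0),
    (1, 5.7, 2.8, 4.1, 1.3),
    (1, 6.3, 3.3, 4.7, 1.6),
    (2, 5.9, 3.0, 5.1, 1.8),
]

# y is the vector of labels; X is the design matrix with one row per example.
y = [row[0] for row in rows]
X = [list(row[1:]) for row in rows]

N = len(rows)           # number of examples in this sample
n_features = len(X[0])  # 4 measurements per flower

# Data point i = 6 in the slide's 1-indexed notation is index 5 in Python.
i = 6
y_6, x_6 = y[i - 1], X[i - 1]

print(N, n_features)
print(y_6, x_6)
```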
Task: Classification
ML Task: Classification
Predict species label from first two input measurements
$h(\mathbf{x}) \to \hat{y}$
Species | Sepal Length | Sepal Width
0       | 4.3          | 3.0
0       | 4.9          | 3.6
0       | 5.3          | 3.7
1       | 4.9          | 2.4
1       | 5.7          | 2.8
1       | 6.3          | 3.3
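To make the mapping from two measurements to a predicted label concrete, here is a minimal sketch of one possible hypothesis function. The thresholds in `h` are hypothetical, picked by eye to fit the six rows on the slide; this is not a trained model and not the method used in the course.

```python
# The six rows from the slide: (species, sepal length, sepal width)
rows = [
    (0, 4.3, 3.0),
    (0, 4.9, 3.6),
    (0, 5.3, 3.7),
    (1, 4.9, 2.4),
    (1, 5.7, 2.8),
    (1, 6.3, 3.3),
]


def h(x):
    """Hypothetical hand-written rule mapping (sepal length, sepal width) to a label."""
    sepal_length, sepal_width = x
    # Assumed thresholds, chosen only to separate these six rows.
    if sepal_length < 5.5 and sepal_width >= 3.0:
        return 0
    return 1


X = [(length, width) for _, length, width in rows]
y = [species for species, _, _ in rows]
y_hat = [h(x) for x in X]

# Classification error rate: fraction of rows where the prediction is wrong.
error_rate = sum(int(a != b) for a, b in zip(y, y_hat)) / len(y)
print(y_hat, error_rate)
```

A rule tuned to six rows says nothing about how it would perform on the remaining flowers, which is why the performance measure is always reported for a specific dataset.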
Classification
Notation alert: Indicator function
$\mathbb{1}(z) = \mathbf{1}(z) = \begin{cases} 1 & \text{if } z \text{ is true} \\ 0 & \text{otherwise} \end{cases}$
Iris data example
$\mathcal{D} = \{(\mathbf{x}^{(i)}, y^{(i)})\}_{i=1}^{N}$, where $\mathbf{x}^{(i)} \in \mathbb{R}^{4}$, $y^{(i)} \in \{0, 1, 2\}$
Predict species label from input measurements
$h(\mathbf{x}) \to \hat{y}$
Performance measure?
Classification error rate
▪ Fraction of times $y \neq \hat{y}$ in a given dataset
▪ $\frac{1}{N} \sum_{i=1}^{N} \mathbb{1}(y^{(i)} \neq \hat{y}^{(i)})$
ML Tasks
Supervised learning: Pairs of input and output in training data
$\mathcal{D} = \{(\mathbf{x}^{(i)}, y^{(i)})\}_{i=1}^{N}$
$h(\mathbf{x}) \to \hat{y}$
Classification
▪ Output labels
▪ 𝑦 ∈ 𝒴, where 𝒴 is discrete and order of values has no meaning
Regression
▪ Output values
▪ 𝑦 ∈ 𝒴, where 𝒴 is usually continuous, order of values has meaning
Unsupervised Tasks
ML Tasks
Unsupervised learning
$\mathcal{D} = \{\mathbf{x}^{(i)}\}_{i=1}^{N}$
$h(\mathbf{x}) \to\ ???$
▪ Training data has no output values
▪ Tasks can vary
▪ Often used to organize data for future (minimally) supervised learning
Task: Face Generation
[Link]
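Example: Unsupervised autoencoder → Random image generation
Autoencoder: $\mathbf{x} \to h(\mathbf{x}) \to \hat{\mathbf{x}}$
Encoder and decoder: $\mathbf{x} \to f(\mathbf{x}) \to \mathbf{z} \to g(\mathbf{z}) \to \hat{\mathbf{x}}$
Generation: $\mathbf{z} \to g(\mathbf{z}) \to \hat{\mathbf{x}}$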
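The structure of the autoencoder example can be sketched with a toy encoder and decoder. Everything here is an assumption for illustration: the inputs are 2-D points instead of images, `f` and `g` use a fixed hypothetical direction instead of learned parameters, and a real autoencoder would train both functions so that the reconstruction stays close to the input.

```python
import random

# Hypothetical fixed unit direction (0.6^2 + 0.8^2 = 1); a real model would learn this.
DIRECTION = (0.6, 0.8)


def f(x):
    """Encoder: compress a 2-D input x into a 1-D code z."""
    return x[0] * DIRECTION[0] + x[1] * DIRECTION[1]


def g(z):
    """Decoder: expand a 1-D code z back into a 2-D output."""
    return (z * DIRECTION[0], z * DIRECTION[1])


def h(x):
    """Autoencoder: x -> f(x) -> z -> g(z) -> x_hat."""
    return g(f(x))


# Reconstruction: this input lies along DIRECTION, so it is recovered almost exactly.
x = (3.0, 4.0)
x_hat = h(x)
print(x_hat)

# Generation: skip the encoder, sample a random code z, and decode it.
rng = random.Random(0)
z_new = rng.gauss(0.0, 1.0)
x_generated = g(z_new)
print(x_generated)
```

The generation step uses only the decoder `g`, which is why no input is needed to produce a new output.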
Example: Text Generation
Vocab pause
Task
▪ Prediction
▪ Inference
▪ Hypothesis function
▪ Classification
▪ Regression
Supervised
Unsupervised
Experience/Data
Input
▪ Input feature
▪ Measurement
▪ Attribute
Output
▪ Target
▪ Class/category/label
▪ True output
▪ Measured output
▪ Predicted output
Performance Measure
▪ Objective function
Classification
▪ Error rate
▪ Accuracy rate
Regression
▪ Mean squared error
Training
▪ Model
▪ Model structure
▪ Model parameters
Training and ML Models
Machine Learning
Using (training) data to learn a model that we’ll later use for prediction
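Training: Training Data (input and measured output) → Training → Model (structure and parameters, $\theta$)
$\mathcal{D}_{train} = \{(\mathbf{x}^{(i)}, y^{(i)})\}_{i=1}^{N}$
Prediction: $h(\mathbf{x})$
Input $\mathbf{x}^{(new)}$ → Model → Predicted Output $\hat{y}^{(new)}$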
Training data, example by example: $(\mathbf{x}^{(1)}, y^{(1)}), (\mathbf{x}^{(2)}, y^{(2)}), (\mathbf{x}^{(3)}, y^{(3)}), \ldots, (\mathbf{x}^{(N)}, y^{(N)})$
Task: Car Price Prediction
Regression: learning a model to predict a numerical output (but not numbers that merely represent categories; that would be classification)
Example
Trying to see how much I
should sell my car for.
Prediction: Input → Model → Predicted Output
What input features should we use?
Poll 2
What input features should we use?
Regression
Example
Trying to see how much I should sell my car for. Looking up data from car websites, I find the mileage and the selling price for each car in a set of cars.
Regression Model
Model?
▪ Memorization
▪ Nearest neighbor
▪ Linear
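The three candidate models for the car price task can be compared on a tiny example. This is a sketch under stated assumptions: the mileage and price values are made up for illustration (not real listings), the function names are hypothetical, and the linear model is fit with ordinary least squares on a single input feature, which the slides do not specify.

```python
# Hypothetical training data: (mileage in thousands of miles, price in thousands of dollars).
train = [(20.0, 18.0), (40.0, 15.0), (60.0, 12.0), (80.0, 9.0)]


def train_memorization(data):
    """Memorization: store every (input, output) pair exactly as seen."""
    return dict(data)


def predict_memorization(model, x):
    """Return the stored output, or None if this exact input was never seen."""
    return model.get(x)


def predict_nearest_neighbor(data, x):
    """Nearest neighbor: copy the output of the closest training input.

    Ties are broken in favor of the earlier training example.
    """
    _, nearest_y = min(data, key=lambda pair: abs(pair[0] - x))
    return nearest_y


def train_linear(data):
    """Linear: fit y = slope * x + intercept by ordinary least squares."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in data)
    variance = sum((x - mean_x) ** 2 for x, _ in data)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept  # these two numbers are the model parameters


def predict_linear(theta, x):
    slope, intercept = theta
    return slope * x + intercept


memory = train_memorization(train)
theta = train_linear(train)

new_mileage = 45.0  # an input that does not appear in the training data
print(predict_memorization(memory, new_mileage))
print(predict_nearest_neighbor(train, new_mileage))
print(predict_linear(theta, new_mileage))
```

Memorization has no answer for an unseen mileage, nearest neighbor reuses the price of the closest known car, and the linear model interpolates between cars using its two parameters.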