S24 Lecture 2 ML Problem Formulation

The document discusses the formulation of machine learning problems, focusing on the components of task, performance measure, and experience. It outlines various machine learning tasks such as classification and regression, using examples like the Iris dataset to illustrate input-output relationships. Additionally, it touches on the roles of agents, data representation, and the distinction between supervised and unsupervised learning.

Uploaded by

Shruthi


10-315

Machine Learning
Problem Formulation

Instructor: Pat Virtue

Today
Autoencoder (Aliens) (previous slides)
▪ Features
ML Problem Formulation
▪ Task input and output
▪ Task, Performance, Experience
▪ Data and notation
▪ Examples: Iris Classification and Car Price Regression
ML Training and Models
▪ Linear
▪ Memorization
▪ Nearest Neighbor
ML Problem Formulation
Agents
An agent is an entity that perceives and acts.
Actions can have an effect on the environment.
The specific sensors and actuators affect what the agent is capable of perceiving and what actions it is capable of taking.

[Figure: agent-environment loop. Environment → Percepts → Sensors → Agent (?) → Actuators → Actions → Environment]
Slide credit: [Link]
Agent: Simple Input/Output Task

[Figure: Input → Agent (?) → Predicted Output]
Task Input and Output
Input                 Task                        Output
Petal measurements    Iris classification         Category
Time of day           Traffic prediction          Traffic volume
Image                 Image classification        Category
Image                 Image denoising             Image
Text                  Text to image generation    Image
???                   Face generation             Image

Task: Face Generation
[Link]
Machine Learning Problem Formulation
Three components <T,P,E>:
1. Task, T
2. Performance measure, P
3. Experience, E

Definition of learning:
A computer program learns if its performance at tasks in T, as measured by P, improves with experience E.

Definition from (Mitchell, 1997)
Machine Learning Problem Formulation
Notation
Task
Formalize the task as a mapping from input to output: $h(x) \to \hat{y}$
Experience
Data! Task experience examples will usually be pairs (input, measured output): $\mathcal{D} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{N}$

Performance measure
Objective function that gives a single numerical value representing how well the system performs for a given dataset
▪ Classification: error rate, $\frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left(y^{(i)} \neq \hat{y}^{(i)}\right)$
▪ Regression: mean squared error, $\frac{1}{N} \sum_{i=1}^{N} \left(y^{(i)} - \hat{y}^{(i)}\right)^2$
Slide: CMU ML, Tom Mitchell and Roni Rosenfeld
Notation alert: Indicator function
$\mathbb{1}(z) = \mathbf{1}(z) = \begin{cases} 1 & \text{if } z \text{ is true} \\ 0 & \text{otherwise} \end{cases}$
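The two performance measures above can be sketched in a few lines of plain Python. This is only an illustration of the formulas: the function names and the small label and target lists are made up here and do not come from the slides.

```python
def indicator(z):
    """Indicator function: 1 if the condition z is true, 0 otherwise."""
    return 1 if z else 0


def error_rate(y_true, y_pred):
    """Classification error rate: fraction of examples where y != y_hat."""
    n = len(y_true)
    return sum(indicator(y != y_hat) for y, y_hat in zip(y_true, y_pred)) / n


def mean_squared_error(y_true, y_pred):
    """Regression MSE: average of (y - y_hat)^2 over the dataset."""
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical labels and predictions, made up for illustration only.
labels = [0, 0, 1, 1, 2]
predicted_labels = [0, 1, 1, 1, 2]
print(error_rate(labels, predicted_labels))  # one mismatch out of five examples

# Hypothetical numeric targets and predictions, also made up.
targets = [1.0, 2.0, 3.0]
predicted_targets = [1.0, 2.5, 2.0]
print(mean_squared_error(targets, predicted_targets))
```

Both functions return a single number for a whole dataset, which is what lets them serve as the performance measure P.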
Experience: Data and Notation
Example Dataset: Fisher Iris Dataset
Fisher (1936) used 150 measurements of flowers from 3 different species, Iris setosa (0), Iris versicolor (1), and Iris virginica (2), collected by Anderson (1936)

Species   Sepal Length   Sepal Width   Petal Length   Petal Width
0         4.3            3.0           1.1            0.1
0         4.9            3.6           1.4            0.1
0         5.3            3.7           1.5            0.2
1         4.9            2.4           3.3            1.0
1         5.7            2.8           4.1            1.3
1         6.3            3.3           4.7            1.6
2         5.9            3.0           5.1            1.8

Images and full dataset: [Link]

Example Dataset: Fisher Iris Dataset
from sklearn import datasets
iris = datasets.load_iris()
X = [Link]
y = [Link]

Assume samples in data are i.i.d.

Dataset notation
$\mathcal{D} = \{(y^{(i)}, \mathbf{x}^{(i)})\}_{i=1}^{N} = \{(y^{(i)}, x_1^{(i)}, x_2^{(i)}, x_3^{(i)}, x_4^{(i)})\}_{i=1}^{N}$

Linear algebra can represent all data
$\mathbf{y} \in \{0, 1, 2\}^{N}$
$X \in \mathbb{R}^{N \times 4}$ (design matrix)

Images and full dataset: [Link]

Data point $i = 6$: $(y^{(6)}, \mathbf{x}^{(6)})$, the sixth row of the table (species 1 with measurements 6.3, 3.3, 4.7, 1.6)

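The dataset notation can be made concrete with the seven example rows shown in the table. The slide's own code lines read `X = [Link]` and `y = [Link]` after extraction, so rather than guess at them, this sketch builds the label vector and design matrix by hand from the table. The variable names are illustrative, and with the full dataset N would be 150 rather than 7.

```python
# The seven example rows from the slide's table:
# (species, sepal length, sepal width, petal length, petal width).
rows = [
    (0, 4.3, 3.0, 1.1, 0.1),
    (0, 4.9, 3.6, 1.4, 0.1),
    (0, 5.3, 3.7, 1.5, 0.2),
    (1, 4.9, 2.4, 3.3, 1.0),
    (1, 5.7, 2.8, 4.1, 1.3),
    (1, 6.3, 3.3, 4.7, 1.6),
    (2, 5.9, 3.0, 5.1, 1.8),
]

# Split each row into the label y^(i) and the feature vector x^(i).
y = [row[0] for row in rows]         # label vector, length N
X = [list(row[1:]) for row in rows]  # design matrix, N rows by 4 columns

N = len(rows)
print(N, len(X[0]))  # number of examples and number of features per example

# The slides index data points from 1; Python lists index from 0.
i = 6
y_i, x_i = y[i - 1], X[i - 1]
print(y_i, x_i)
```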

Task: Classification
ML Task: Classification
Predict species label from first two input measurements
$h(\mathbf{x}) \to \hat{y}$

Species   Sepal Length   Sepal Width
0         4.3            3.0
0         4.9            3.6
0         5.3            3.7
1         4.9            2.4
1         5.7            2.8
1         6.3            3.3

Predict species label from input measurements
$h(\mathbf{x}) \to \hat{y}$

Performance measure?
Classification error rate
▪ Fraction of times $y \neq \hat{y}$ in a given dataset
▪ $\frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left(y^{(i)} \neq \hat{y}^{(i)}\right)$

Images and full dataset: [Link]
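As a small worked example of a hypothesis function and its error rate, the sketch below applies a hand-written rule to the six two-feature rows from the classification slide. The rule itself (a threshold of 3.0 on sepal width) is a hypothetical choice made for illustration; it is not a model from the lecture and was not learned from data.

```python
# Six rows from the slide's two-feature table: (species, sepal length, sepal width).
data = [
    (0, 4.3, 3.0),
    (0, 4.9, 3.6),
    (0, 5.3, 3.7),
    (1, 4.9, 2.4),
    (1, 5.7, 2.8),
    (1, 6.3, 3.3),
]


def h(x):
    """Hypothetical hypothesis: predict species 0 when sepal width >= 3.0, else 1."""
    sepal_length, sepal_width = x
    return 0 if sepal_width >= 3.0 else 1


y_true = [row[0] for row in data]
y_pred = [h(row[1:]) for row in data]

# Error rate: fraction of examples where the prediction differs from the label.
mistakes = sum(1 for y, y_hat in zip(y_true, y_pred) if y != y_hat)
error_rate = mistakes / len(data)
print(y_pred, error_rate)
```

On these six rows the rule misclassifies only the last one, whose sepal width of 3.3 falls on the species-0 side of the threshold.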

ML Tasks
Supervised learning: Pairs of input and output in training data
$\mathcal{D} = \{(\mathbf{x}^{(i)}, y^{(i)})\}_{i=1}^{N}$
$h(\mathbf{x}) \to \hat{y}$

Classification
▪ Output labels
▪ 𝑦 ∈ 𝒴, where 𝒴 is discrete and order of values has no meaning

Regression
▪ Output values
▪ 𝑦 ∈ 𝒴, where 𝒴 is usually continuous, order of values has meaning
Unsupervised Tasks
ML Tasks
Unsupervised learning
$\mathcal{D} = \{\mathbf{x}^{(i)}\}_{i=1}^{N}$
$h(\mathbf{x}) \to\ ???$

▪ Training data has no output values

▪ Tasks can vary
▪ Often used to organize data for future (minimally) supervised learning
Task: Face Generation
[Link]
Example: Unsupervised autoencoder → Random image generation

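$\mathbf{x} \to h(\mathbf{x}) \to \hat{\mathbf{x}}$

Autoencoder: $\mathbf{x} \to f(\mathbf{x}) \to \mathbf{z} \to g(\mathbf{z}) \to \hat{\mathbf{x}}$
Random image generation: $\mathbf{z} \to g(\mathbf{z}) \to \hat{\mathbf{x}}$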
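The encode/decode structure can be sketched with a toy example. Here the encoder f and decoder g are fixed by hand on 2-D points, which is an assumption made purely for illustration: a real autoencoder would learn both functions from the unlabeled data so that the reconstruction stays close to the input, and the inputs would be images rather than pairs of numbers.

```python
import random


def f(x):
    """Encoder: compress a 2-D point to a 1-D code z (the mean of its coordinates)."""
    return (x[0] + x[1]) / 2.0


def g(z):
    """Decoder: expand a 1-D code z back to a 2-D point on the line x1 == x2."""
    return (z, z)


def h(x):
    """Autoencoder: encode, then decode, giving the reconstruction x_hat."""
    return g(f(x))


# Reconstruction: a point near the line x1 == x2 is reproduced closely.
x = (2.0, 2.5)
x_hat = h(x)
print(x, "->", x_hat)

# Generation: skip the encoder, draw a random code z, and decode it.
rng = random.Random(0)  # fixed seed so the run is repeatable
z = rng.uniform(0.0, 5.0)
x_generated = g(z)
print(z, "->", x_generated)
```

The second half mirrors the generation path on the slide: only the decoder g is used, and the code z comes from a random draw instead of from an input.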
Example: Text Generation

Vocab pause

Task
▪ Prediction
▪ Inference
▪ Hypothesis function
▪ Classification
▪ Regression

Experience/Data
Input
▪ Input feature
▪ Measurement
▪ Attribute
Output
▪ Target
▪ Class/category/label
▪ True output
▪ Measured output
▪ Predicted output
Supervised
Unsupervised

Performance Measure
▪ Objective function
Classification
▪ Error rate
▪ Accuracy rate
Regression
▪ Mean squared error

Training
▪ Model
▪ Model structure
▪ Model parameters
Training and ML Models
Machine Learning
Using (training) data to learn a model that we’ll later use for prediction

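[Figure: Training Data (input and measured output, $\mathcal{D}_{train} = \{(\mathbf{x}^{(i)}, y^{(i)})\}_{i=1}^{N}$) → Training → Model (structure and parameters, $\theta$)]
[Figure: Prediction, $h(\mathbf{x})$: Input $\mathbf{x}^{(new)}$ → Model → Predicted Output $\hat{y}^{(new)}$]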
Training Data: $(\mathbf{x}^{(1)}, y^{(1)}), (\mathbf{x}^{(2)}, y^{(2)}), (\mathbf{x}^{(3)}, y^{(3)}), \ldots, (\mathbf{x}^{(N)}, y^{(N)})$
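The split between training and prediction can be sketched as two functions: one that turns training data into parameters, and one that applies the model to a new input. The model structure used here, a constant predictor whose single parameter is the mean of the measured outputs, is a deliberately trivial assumption and is not one of the models in the lecture. The data values are made up.

```python
def train(training_data):
    """Training: turn (input, measured output) pairs into model parameters theta.

    The model structure assumed here is a constant predictor, so theta is a
    single number: the mean of the measured outputs.
    """
    outputs = [y for _, y in training_data]
    theta = sum(outputs) / len(outputs)
    return theta


def h(x, theta):
    """Prediction: apply the learned model to an input (this structure ignores x)."""
    return theta


# Hypothetical training pairs (x, y), made up for illustration.
D_train = [(1.0, 10.0), (2.0, 14.0), (3.0, 12.0)]
theta = train(D_train)

x_new = 2.5
y_hat_new = h(x_new, theta)
print(theta, y_hat_new)
```

Swapping in a different model structure changes what `theta` holds and how `h` uses it, but the two-stage shape stays the same.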
Task: Car Price Prediction
Regression: learning a model to predict a numerical output (but not numbers that just represent categories; that would be classification)
Example
Trying to see how much I should sell my car for.

[Figure: Prediction: Input → Model → Predicted Output]
What input features should we use?

Poll 2
What input features should we use?

Regression

Example
Trying to see how much I should sell my car for. Looking up data from car websites, I find the mileage for a set of cars and the selling price for each car.
Regression Model

Model?
Model: Memorization
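The slide only names this model, so the following is one plausible reading rather than the lecture's definition: a memorization model stores every training pair and can only answer for inputs it has already seen. The (mileage, price) pairs and function names are hypothetical.

```python
# Hypothetical (mileage, selling price) pairs, made up for illustration.
training_data = [(20000, 18000.0), (50000, 15000.0), (90000, 11000.0)]


def train_memorizer(pairs):
    """Training: store every (input, measured output) pair in a lookup table."""
    return dict(pairs)


def predict_memorizer(table, mileage):
    """Prediction: return the stored price, or None for an input never seen."""
    return table.get(mileage)


table = train_memorizer(training_data)
print(predict_memorizer(table, 50000))  # mileage seen during training
print(predict_memorizer(table, 60000))  # mileage not seen, so no prediction
```

The second query shows the limitation: the model is perfect on its training inputs and has nothing to say about any other mileage.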
Model: Nearest neighbor
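A nearest neighbor model can be sketched as follows, assuming a single input feature (mileage) and absolute difference as the distance. The slide does not specify either choice, and the data and function name are hypothetical.

```python
# Hypothetical (mileage, selling price) pairs, made up for illustration.
training_data = [(20000, 18000.0), (50000, 15000.0), (90000, 11000.0)]


def predict_nearest_neighbor(pairs, mileage):
    """Return the price of the training car whose mileage is closest to the query."""
    nearest_mileage, nearest_price = min(
        pairs, key=lambda pair: abs(pair[0] - mileage)
    )
    return nearest_price


# 60000 is closest to the training car with mileage 50000.
print(predict_nearest_neighbor(training_data, 60000))
```

Unlike memorization, this model gives an answer for every query, but its prediction jumps from one stored price to another as the query crosses the midpoint between two training mileages.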
Model: Linear
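For the linear model, one common choice is to pick the slope and intercept that minimize mean squared error on the training data. The sketch below assumes that fitting criterion and a single input feature, neither of which is stated on the slide. The data is hypothetical and was chosen to lie exactly on a line so the result is easy to check by hand.

```python
# Hypothetical (mileage, selling price) pairs, made up for illustration.
training_data = [(20000, 18000.0), (50000, 15000.0), (90000, 11000.0)]


def train_linear(pairs):
    """Training: least-squares fit of price = slope * mileage + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_linear(theta, mileage):
    """Prediction: evaluate the fitted line at a new mileage."""
    slope, intercept = theta
    return slope * mileage + intercept


theta = train_linear(training_data)
print(theta)
print(predict_linear(theta, 60000))
```

Here the parameters are just two numbers, and the prediction for a mileage of 60000 falls between the neighboring training prices instead of copying one of them.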
