CSE 445 - Lecture 1 - Machine Learning Introduction

CSE 445: Machine Learning

Introduction
Resources
 The slides provided in the course should be enough – but there is a plethora of
fantastic resources available, so use them!
 Recommended Books:
 Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien
Géron (will be followed extensively in the course, with code examples from
https://github.com/ageron/handson-ml )
 Pattern Recognition and Machine Learning by Christopher Bishop (excellent
resource for mathematical foundations)
 The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (good reference)
 Additional Material:
 Andrew Ng’s course on Machine Learning available on Coursera
 CS 189, Berkeley
 CS 229, Stanford
Helpful Prerequisites
 MAT361 – Probability & Statistics
 Probability distributions, random variables, conditional probability, variance (a few
of the important concepts to recall)
 MAT125 – Linear Algebra
 Matrix multiplication, eigenvalues, eigenvectors
 Basic programming background in Python (an OK understanding of Python
syntax is all that’s necessary – Géron’s textbook has excellent code examples)
 None of these are compulsory, but the material is easier to grasp if you have completed them

Assessment (tentative)
 In-class pop quizzes on Socrative (15%)
 Midterm (20%)
 Final (30%)
 Project (30%)
 Class Participation (5%)
Course Project
 Groups of up to 4 members (4 is a hard maximum)
 Video demo submission and in-person/online presentation at the end of the
semester
 4–6 page report due at semester end, IEEE format – must include a link to the GitHub
repo
 Potential topics (a few examples):
 Covid-19
 Computer Vision
 Natural Language Processing
 Reinforcement Learning
 Speech & Music Recognition
 Biomedical Imaging and Biosignals
What is Artificial Intelligence?

The science of making machines that:
 Think like people
 Act like people
 Think rationally
 Act rationally

“Machines that act rationally” – a fairly broad definition!


What is Machine Learning?
Tom Mitchell (1998): A computer program is
said to learn from experience E with respect
to some class of tasks T and performance
measure P, if its performance at tasks in T, as
measured by P, improves with experience E.

Example:
Task: playing checkers
Experience (data): games played by the program (with itself)
Performance measure: winning rate

(Image from Tom Mitchell’s homepage)
Definition of Machine Learning
Arthur Samuel (1959): Machine Learning is the
field of study that gives the computer the ability
to learn without being explicitly programmed.

(Photos from Wikipedia)


Traditional Programming

• Traditional Programming: writing a set of RULES to find ANSWERS from DATA
The ML Approach
Machine Learning: Use DATA and ANSWERS to learn the underlying set of RULES

Great for:

• Problems that require a lot of fine-tuning or long lists of rules
• Changing environments – ML systems can ADAPT
• Getting insights from large amounts of data
• Complex problems that yield no good solution with the traditional approach
Deep Learning
 Subset of ML - loosely mimics
structure/function of human brain
 Unlike traditional ML, does not require
manual feature extraction
 Keeps getting better with more data
(typically)
 Computer Vision (CNN, GAN)
 Natural Language Processing (RNN,
LSTM)
 Automatic Speech Recognition (RNN)
Summary – AI vs ML vs DL
 Subsets of each other
 1950 – 1990: AI in the form of Expert systems (airplane
autopilot) and Games (checkers, chess)
 1990- : Statistical Approaches with ML, busts AI winter
 2010 - : Deep Learning revolutionizes CV, NLP among
other applications
 Narrow AI
 Systems that can do a few defined things (such as playing
chess, or driving a car) as well as, or better than, humans
 Can’t do EVERYTHING a human being can do – yet
 AI is not “taking over the world” anytime soon
 Tell your uncles to relax and stop using WhatsApp
What kind of ML system is it?
 Useful to classify ML systems based on the following criteria:
1. Does it require human supervision?
 Supervised Learning
 Semisupervised Learning
 Unsupervised Learning
 Reinforcement Learning
2. Can it learn incrementally on the fly?
 Online Learning
 Batch Learning
3. Does the system build a predictive model?
 Model-based Learning
 Instance-based Learning
• These criteria are not exclusive – they can be combined
• e.g. a spam filter that learns on the fly with a deep neural
network is an online, model-based, supervised learning
system
Supervised Learning
 Training data fed to algorithm
includes the desired
answers/solutions (labels)
 Example algorithms:
 Linear Regression
 Logistic Regression
 SVM
 Decision Tree
 Neural Network
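A minimal supervised-learning sketch, using made-up toy data (the dataset and the hours-studied scenario are illustrative, not from the lecture): logistic regression is trained on features paired with the desired answers (labels).

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical labeled data: hours studied -> pass (1) / fail (0).
# The labels y are the "desired answers" fed to the algorithm.
X = [[1], [2], [3], [8], [9], [10]]
y = [0, 0, 0, 1, 1, 1]

clf = LogisticRegression()
clf.fit(X, y)                        # train on (data, labels) pairs
pred = clf.predict([[1.5], [9.5]])   # predict labels for new instances
```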
Unsupervised Learning
 Training data is unlabeled
 System learns without direct human
supervision
 Widely used in:
 Clustering
 Anomaly detection
 Association mining
 Data preprocessing
 Example algorithms:
 K-means
 PCA
 SVD
 ICA
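A quick sketch of the unsupervised setting with K-means (the toy blob data is illustrative, not from the lecture): no labels are given, yet the algorithm recovers the grouping on its own.

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: two obvious blobs, but no labels are provided.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [4.9, 5.1]])

# K-means partitions the points into k=2 clusters by itself.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_   # cluster assignment for each point
```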
Semisupervised Learning
 Partially labeled data
 Unsupervised learning used
to cluster similar data
together
 Human input taken to label
the clusters
 e.g. Google Photos will
cluster similar faces, and ask
the user if they are the
same person
Reinforcement Learning
 The learning system (agent) can:
 Observe the environment
 Select and perform an action based on
environment
 Get rewards/penalties as a result
 Based on the reward, the agent changes
its state
 Agent’s aim: maximize reward
 Learns what the best policy should be
 Policy defines what actions should be
chosen in a certain situation
 Very effective in controlled environments
(such as a game of chess)
 With the progress in deep learning,
increasingly used in more complex tasks
(such as driving the Mars rover)
Batch Learning vs Online Learning
 Batch Learning
 Not capable of learning after
deployment
 Must be retrained from scratch –
computationally expensive!
 Online Learning
 Can continue to learn after
deployment
 Can take advantage of parallel
computing – no down time
 Preferred choice in production
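A sketch of online learning with scikit-learn's `SGDRegressor` (the streaming setup and the underlying rule y = 3x + 1 are invented for illustration): `partial_fit` updates the model one mini-batch at a time, so there is no retraining from scratch.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
model = SGDRegressor(learning_rate="constant", eta0=0.1, random_state=0)

# Data arrives in mini-batches; partial_fit updates the model
# incrementally after deployment, instead of refitting everything.
for _ in range(200):
    X_batch = rng.uniform(0, 1, size=(20, 1))
    y_batch = 3.0 * X_batch.ravel() + 1.0   # underlying rule: y = 3x + 1
    model.partial_fit(X_batch, y_batch)
```

After enough batches, the learned slope and intercept approach the underlying rule.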
Instance-Based vs Model-Based Learning
 Two approaches to generalization
 Instance-based Learning
 Memorize known data
 Use similarity measure to generalize
new instances
 e.g. new instance is a triangle because
it’s similar to the other triangles
 Model-based Learning
 Build model from training data
 Predict based on model output
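The two approaches can be contrasted side by side (toy data invented for illustration): k-nearest neighbors memorizes the training instances and generalizes by similarity, while logistic regression builds a parametric model.

```python
from sklearn.neighbors import KNeighborsClassifier   # instance-based
from sklearn.linear_model import LogisticRegression  # model-based

# Hypothetical labeled data.
X = [[0], [1], [2], [8], [9], [10]]
y = [0, 0, 0, 1, 1, 1]

# k-NN stores the instances; predictions come from the 3 most
# similar points. Logistic regression instead fits parameters.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
logreg = LogisticRegression().fit(X, y)

knn_pred = knn.predict([[1.5]])[0]
log_pred = logreg.predict([[1.5]])[0]
```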
Example ML Task: Does money make people happy?

• Life Satisfaction data from OECD
• GDP per capita data from IMF

What relationship can we infer between life satisfaction and GDP per capita from the graph?
Model Selection
 Based on data, we can select a linear model of life satisfaction with just one
feature/attribute: GDP per capita
life_satisfaction = θ0 + θ1 × GDP_per_capita
 The model has two parameters: the y-intercept θ0 and the slope θ1
 How to figure out the parameter values?
Performance Measure
 Define a utility function (how good is the fitted line?), or a cost function (how bad is
the fitted line?)
 Linear Regression
 Training: minimize the distance between the linear model’s predictions and
the actual training examples, until the estimated parameter values converge
 In this example, Linear Regression gives θ0 = 4.85 and θ1 = 4.91 × 10⁻⁵
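The fit above can be sketched in a few lines with scikit-learn. The data points below are hypothetical stand-ins, NOT the real OECD/IMF figures, so the fitted values only roughly resemble the slide's θ0 and θ1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data: GDP per capita (USD) vs. life satisfaction.
gdp = np.array([[9_000], [27_000], [37_000], [50_000], [55_000]])
sat = np.array([5.0, 6.1, 6.5, 7.3, 7.2])

# Ordinary least squares finds the intercept theta0 and slope theta1.
model = LinearRegression().fit(gdp, sat)
theta0, theta1 = model.intercept_, model.coef_[0]
```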
Problems with Machine Learning
 3 V’s of Big Data
 Volume, Variety, Velocity
 Problem #1: Training data!
 Insufficient quantity
 Nonrepresentative data
 Poor-quality data
 Problem #2: How “fit” is it?
 Overfitting data
 Underfitting data
 Problem #3: Which features should be used?
 Deep Learning automates feature selection
Overfitting
 Most common problem in ML – do not overgeneralize!
 The polynomial model performs better than the linear model on the training data
 How about on the test data?
How to avoid overfitting
 Tip #1: REGULARIZATION – USE IT
 Constrain model to keep it simple – reduce risk of overfitting
 If you can stand on one leg, you’ll be able to stay balanced with two legs
 Hyperparameters – control level of regularization
 Tip #2: Get more training data, and reduce noise in it
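A small sketch of Tip #1 (the noisy sine data and degree-10 polynomial are invented for illustration): Ridge adds an L2 constraint, and its `alpha` hyperparameter controls how strongly the coefficients are kept small.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, size=(15, 1)), axis=0)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 15)

# Unconstrained degree-10 polynomial vs. the same model with Ridge
# (L2) regularization; alpha controls the regularization strength.
plain = make_pipeline(PolynomialFeatures(10), LinearRegression()).fit(X, y)
ridge = make_pipeline(PolynomialFeatures(10), Ridge(alpha=1.0)).fit(X, y)

# Regularization shrinks the wild coefficients of the overfit model.
plain_max = np.abs(plain.named_steps["linearregression"].coef_).max()
ridge_max = np.abs(ridge.named_steps["ridge"].coef_).max()
```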
Model Evaluation
 How good is your model?
 Test it on new data – data not seen by the model ever before!
 Keep 80% for training, set 20% for testing
 NEVER go below 10% test data – a genuinely better model matters more than a
better-looking “accuracy” number
 How to choose hyperparameters (e.g. the regularization level)?
 Keep a portion of the training data held out for validation
 Alternatively, use cross-validation (many validation sets instead of one)
 Pick the hyperparameters that work best on the validation set, then evaluate
the final model on the test dataset
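The 80/20 split and cross-validation can be sketched as follows, using scikit-learn's built-in Iris dataset purely for illustration (the dataset choice is not from the lecture).

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 20% as test data the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
test_acc = clf.score(X_test, y_test)   # evaluated on unseen data only

# Cross-validation: many validation splits instead of one held-out set,
# computed on the training portion so the test set stays untouched.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000),
                            X_train, y_train, cv=5)
```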
Ratios
 A great model
 trained with 60% training data, 20% validation data, and 20% testing data
 An okay model
 trained with 70% training data, 15% validation data, and 15% testing data
 A barely-acceptable model
 trained with 80% training data, 10% validation data, and 10% testing data
 Models with worse ratios – hacks
 Unless there are millions of instances in the dataset
 “No Free Lunch” theorem
 Only way to know for sure which model works best is to evaluate them
 Make reasonable assumptions about your data to select model
