Lecture I : Introduction to Machine Learning
Outline
Machine Learning definition
Types of Machine Learning models
Summary
Machine Learning (ML) enables computers to learn and improve
performance through experience without explicit programming.
ML focuses on detecting patterns in data and using them for future
predictions.
Three main types of ML: Supervised Learning, Unsupervised Learning,
and Reinforcement Learning.
Supervised Learning uses labeled data to predict outputs for new inputs.
Unsupervised Learning finds hidden patterns in unlabeled data, for example
through clustering or dimensionality reduction.
Reinforcement Learning involves an agent learning through interaction
with an environment.
I/Machine learning definition & tools
1. Definition
Arthur Samuel, 1959 : Machine Learning (ML) is a field of study that gives
computers the ability to learn without being explicitly programmed.
Tom Mitchell, 1997 : Machine Learning (ML) is any computer program that
improves its performance P at some task T through experience E.
For example, consider a checkers-playing program :
T : Playing checkers
E : The experience the program gains by playing millions of games
against other players (or even against itself)
P : How likely the program is to win its next game
Kevin Murphy, 2012 : Machine Learning (ML) is a set of methods that can
automatically detect patterns in data, and then use the uncovered patterns to
predict future data.
2. Types of machine learning
1. Supervised Learning
Given a labeled set of input-output pairs $D = \{(x_i, y_i)\}_{i=1}^{N}$
D is the training set
N is the number of labeled examples
x_i is a d-dimensional vector of features (also called attributes or covariates)
y_i is the response variable
Based on this dataset, we will create a machine learning model that learns
from the data to predict the corresponding output for each new input.
Classification when y is categorical
y takes values in a finite set of discrete classes, for example : y ∈ {1,…,C}
Regression when y is real-valued, y ∈ ℝ
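To make the two settings concrete, here is a minimal sketch in plain Python. The toy data, the function names, and the choice of a 1-nearest-neighbor classifier and a least-squares line are assumptions made for illustration only; the lecture does not prescribe any particular model.

```python
import math

def nearest_neighbor_classify(train, x_new):
    """Return the label of the training point closest to x_new (1-NN)."""
    _, closest_y = min(train, key=lambda pair: math.dist(pair[0], x_new))
    return closest_y

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for scalar inputs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Classification: y is categorical (here C = 2 classes, labeled 1 and 2).
clf_data = [((1.0, 1.0), 1), ((1.5, 2.0), 1), ((5.0, 5.0), 2), ((6.0, 5.5), 2)]
predicted_class = nearest_neighbor_classify(clf_data, (5.5, 5.0))

# Regression: y is real-valued (toy data generated from y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
predicted_value = slope * 4.0 + intercept

print(predicted_class, predicted_value)
```

In both cases the model is built from the labeled pairs in D and then applied to an input it has not seen; only the type of y differs.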
Probabilistic models p(y | x, D, θ)
Probabilistic models are methods where the predicted outcomes are
expressed in probabilities. These models do not just predict a single value but
provide probabilities for each possible outcome, allowing users to understand
the confidence level of the predictions.
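As a rough illustration of a model that outputs probabilities rather than a single label, the sketch below estimates p(y = c | x, D) as the fraction of the k nearest training points with label c. This k-nearest-neighbors estimate is an assumption chosen for simplicity (it is non-parametric, so k stands in for the model setting θ); the data and the function name are hypothetical.

```python
import math
from collections import Counter

def knn_class_probabilities(train, x_new, k, classes):
    """Estimate p(y = c | x_new, D) as the share of the k nearest
    training points that carry label c."""
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], x_new))[:k]
    counts = Counter(label for _, label in neighbors)
    return {c: counts[c] / k for c in classes}

train = [((1.0, 1.0), 1), ((1.5, 2.0), 1), ((2.0, 1.5), 1),
         ((5.0, 5.0), 2), ((6.0, 5.5), 2), ((3.0, 3.0), 2)]

probs = knn_class_probabilities(train, (2.5, 2.5), k=3, classes=[1, 2])

# The single "best guess" is the most probable class, but the full
# distribution also tells us how confident the model is.
best_class = max(probs, key=probs.get)
print(probs, best_class)
```

A prediction with probabilities close to 0.5 for both classes signals low confidence, which a bare class label would hide.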
2. Unsupervised Learning
Given a set of inputs $D = \{x_i\}_{i=1}^{N}$
D is the input data
N is the number of examples
x_i is a d-dimensional vector of features (also called attributes or covariates)
There is no response variable
Density estimation p(x_i | θ)
Density estimation methods model the distribution of the inputs in order to
find hidden patterns or structure in the data.
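One simple instance of modeling p(x | θ) is fitting a one-dimensional Gaussian, where θ = (mean, variance) is estimated from the data by maximum likelihood. The Gaussian assumption, the toy data, and the function names below are illustrative choices, not something the lecture specifies.

```python
import math

def fit_gaussian(xs):
    """Maximum-likelihood estimate of theta = (mean, variance)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, var

def gaussian_pdf(x, mu, var):
    """Density p(x | theta) of a Gaussian with the given mean and variance."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

data = [4.0, 5.0, 6.0, 5.0, 5.0]   # unlabeled inputs, no response variable
mu, var = fit_gaussian(data)

# Points near the bulk of the data get a high density, unusual points a low one.
density_typical = gaussian_pdf(5.0, mu, var)
density_unusual = gaussian_pdf(9.0, mu, var)
print(mu, var, density_typical, density_unusual)
```

The fitted density can then be used to judge how typical a new input is, even though no labels were ever provided.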
Clustering :
Clusters : subgroups or subpopulations in the data
Goals :
Discovering the subgroups
Estimating which subgroup a data point belongs to
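Both goals can be sketched with k-means (Lloyd's algorithm), used here as one possible clustering method; the lecture does not name a specific algorithm. The toy points, the initial centers, and the function names are assumptions for illustration.

```python
import math

def assign(points, centers):
    """Goal 2: index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def kmeans(points, centers, iterations=10):
    """Goal 1: discover the subgroups by alternating assignment and update."""
    for _ in range(iterations):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the center to the mean of its members.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[j])
        centers = new_centers
    return centers, assign(points, centers)

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
initial_centers = [points[0], points[3]]   # hypothetical starting guess

centers, labels = kmeans(points, initial_centers)
new_point_label = assign([(2.0, 2.0)], centers)[0]
print(centers, labels, new_point_label)
```

Note that k-means needs the number of clusters and a starting guess up front, and a poor initialization can lead to a different grouping.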
Dimensionality Reduction
Curse of dimensionality : problems that arise when working with
high-dimensional data, such as difficulty measuring meaningful distances and
a higher risk of overfitting.
Dimensionality reduction : a technique used to reduce the number of
features in a dataset while preserving its essential information.
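As a small sketch of dimensionality reduction, the code below projects 2-dimensional points onto their first principal component, found by power iteration on the covariance matrix. Principal component analysis is used here as one common example and is not named in the lecture; the toy data (points lying exactly on a line) and the function names are assumptions. The sketch also assumes the data has non-zero variance and that the starting vector is not orthogonal to the leading direction.

```python
import math

def first_principal_component(points, iterations=100):
    """Return (mean, direction): the data mean and the unit vector of
    largest variance, found by power iteration on the covariance matrix."""
    n, d = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(d)]
    centered = [[p[i] - mean[i] for i in range(d)] for p in points]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(point, mean, v):
    """Reduce a d-dimensional point to a single coordinate along v."""
    return sum((point[i] - mean[i]) * v[i] for i in range(len(v)))

def reconstruct(z, mean, v):
    """Map the 1-D coordinate back to the original d-dimensional space."""
    return [mean[i] + z * v[i] for i in range(len(v))]

points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]   # lies on y = 2x
mean, direction = first_principal_component(points)
codes = [project(p, mean, direction) for p in points]        # 2 features -> 1
restored = [reconstruct(z, mean, direction) for z in codes]
max_error = max(math.dist(p, r) for p, r in zip(points, restored))
print(direction, codes, max_error)
```

Because the toy points lie on a single line, one coordinate per point is enough to recover them; with real data some information is lost, and the aim is to keep that loss small.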
3. Reinforcement Learning
Reinforcement learning (RL) is a type of machine learning where an agent
learns to make decisions by interacting with an environment. The agent
receives feedback in the form of rewards or penalties and chooses its actions
to maximize the cumulative reward over time.
Reinforcement Learning Diagram
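The agent-environment loop can be sketched with tabular Q-learning on a tiny corridor world. Q-learning is chosen here purely for illustration, since the lecture does not name an algorithm; the environment, the reward of 1.0 at the goal, and all hyperparameters (alpha, gamma, epsilon, the seed) are hypothetical.

```python
import random

N_STATES = 5          # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right

def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def choose_action(q_row, epsilon, rng):
    """Agent: explore with probability epsilon, otherwise act greedily
    (ties between equally good actions are broken at random)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):                      # cap on episode length
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

The agent is never told that moving right is correct; it only observes the reward at the goal and gradually propagates that feedback back to earlier states.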