STAT 453: Introduction to Deep Learning and Generative Models
Sebastian Raschka
http://stat.wisc.edu/~sraschka/teaching
Lecture 01
What Are Machine Learning and Deep Learning? An Overview.
with Applications in Python
Sebastian Raschka STAT 453: Intro to Deep Learning 1
Lecture Topics
1. Course overview
2. What is machine learning?
3. The broad categories of ML
4. The supervised learning workflow
5. Necessary ML notation and jargon
6. About the practical aspects and tools
A short teaser: what you
will be able to do after
this course
Audio Classification Using Convolutional Neural Networks
3D Convolutional Networks
Photographic Style Transfer With Deep Learning
My Research Interests
• Cao, Mirjalili, and Raschka (2020). Rank Consistent Ordinal Regression for Neural Networks with Application to Age Estimation. Pattern Recognition Letters, 140, 325-331
• Raschka and Kaufman (2020). Machine Learning and AI-based Approaches for Bioactive Ligand Discovery and GPCR-ligand Recognition. Elsevier Methods, 180, 89–110
• Mirjalili, Raschka, and Ross (2020). PrivacyNet: Semi-Adversarial Networks for Multi-attribute Face Privacy. IEEE Transactions on Image Processing, Vol. 29, pp. 9400-9412, 2020
• Yu and Raschka (2020). Looking Back to Lower-level Information in Few-shot Learning. Information 2020, 11, 7
• Bemister-Buffington, Wolf, Raschka, and Kuhn (2020). Machine Learning to Identify Flexibility Signatures of Class A GPCR Inhibition. Biomolecules 2020, 10, 454
• Raschka, Patterson, and Nolet (2020). Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence. Information 2020, 11, 4
About the course
Topics Planned 1/2
Part 1: Introduction
• L01: Course overview, introduction to deep learning
• L02: The brief history of deep learning
• L03: Single-layer neural networks: The perceptron algorithm
Part 2: Mathematical and computational foundations
• L04: Linear algebra and calculus for deep learning
• L05: Parameter optimization with gradient descent
• L06: Automatic differentiation with PyTorch
• L07: Cluster and cloud computing resources
Part 3: Introduction to neural networks
• L08: Multinomial logistic regression
• L09: Multilayer perceptrons and backpropagation
• L10: Regularization to avoid overfitting
• L11: Input normalization and weight initialization
• L12: Learning rates and advanced optimization algorithms
Topics Planned 2/2
Part 4: Deep learning for computer vision and language modeling
• L13: Introduction to convolutional neural networks
• L14: Convolutional neural network architectures
• L15: Introduction to recurrent neural networks
Part 5: Deep generative models
• L16: Autoencoders
• L17: Variational autoencoders
• L18: Introduction to generative adversarial networks
• L19: Evaluating generative adversarial networks
• L20: Recurrent neural networks for seq-to-seq modeling
• L21: Self-attention and transformer networks
Course Material and Info
Weekly Content
Grading
• 30% Problem Sets (HW assignments and quizzes)
• 20% Midterm Exam
• 50% Class Project:
◦ 5% Project proposal
◦ 20% Project presentation (+ peer review)
◦ 25% Project report (+ peer review)
Questions & Discussions
Important!
3) Important info and announcements: Canvas Announcements page
(Should be activated by default, but please double-check)
What Is
Machine Learning?
A short overview before we jump into
Deep Learning
1. Course overview
2. What is machine learning?
3. The broad categories of ML
4. The supervised learning workflow
5. Necessary ML notation and jargon
6. About the practical aspects and tools
The Connection Between Fields
Machine Learning
Deep Learning
AI
Different Types Of AI
Artificial Intelligence (AI):
originally a subfield of computer science, solving tasks humans are good at (natural language,
speech, image recognition, ...)
Narrow AI:
solving a particular task (playing a game, driving a car, ...)
Artificial General Intelligence (AGI):
multi-purpose AI mimicking human intelligence across tasks
Machine Learning
Deep Learning
AI
Image source: https://www.imdb.com/title/tt0470752/
What This Course Is About
AI: E.g., symbolic expressions, logic rules / "handcrafted" nested if-else programming statements, ...
Machine Learning: E.g., generalized linear models, tree-based methods, "shallow" networks, support vector machines, nearest neighbors, ...
Deep Learning: Main focus of the course
Not all AI Systems involve Machine Learning
Deep Blue used custom VLSI chips to execute the alpha-beta search algorithm in parallel, an example of GOFAI (Good Old-Fashioned Artificial Intelligence).
Machine Learning
Deep Learning
AI
Image source: https://mashable.com/2016/02/10/kasparov-deep-blue/
Examples From The Three Related "Areas"
AI: A non-biological system that is intelligent through rules
Machine Learning: Algorithms that learn models/representations/rules automatically from data/examples
Deep Learning: Algorithms that parameterize multilayer neural networks that then learn representations of data with multiple layers of abstraction
Some Applications Of Machine Learning/Deep Learning
• Email spam detection
• Fingerprint / face detection & matching (e.g., phones)
• Web search (e.g., DuckDuckGo, Bing, Google)
• Sports predictions
• Post office (e.g., sorting letters by zip codes)
• ATMs (e.g., reading checks)
• Credit card fraud
• Stock predictions
Source: http://yann.lecun.com/exdb/lenet/
Some Applications Of Machine Learning/Deep Learning
• Smart assistants (Apple Siri, Amazon Alexa, ...)
• Product recommendations (e.g., Netflix, Amazon)
• Self-driving cars (e.g., Uber, Tesla)
• Language translation (Google Translate)
• Sentiment analysis
• Drug design
• Medical diagnoses
• ...
Source: https://techcrunch.com/2017/11/07/waymo-now-testing-its-self-driving-cars-on-public-roads-with-no-one-at-the-wheel/
The 3 Broad
Categories of ML
(This also applies to DL)
1. Course overview
2. What is machine learning?
3. The broad categories of ML
4. The supervised learning workflow
5. Necessary ML notation and jargon
6. About the practical aspects and tools
The 3 Broad Categories Of ML (And DL)
Supervised Learning: labeled data; direct feedback; predict outcome/future
Unsupervised Learning: no labels/targets; no feedback; find hidden structure in data
Reinforcement Learning: decision process; reward system; learn series of actions
Source: Raschka and Mirjalili (2019). Python Machine Learning, 3rd Edition
Supervised Learning Is The Largest Subcategory
Supervised Learning: labeled data; direct feedback; predict outcome/future
Source: Raschka and Mirjalili (2019). Python Machine Learning, 3rd Edition
Supervised Learning 1: Regression
[Figure: regression plot. x-axis: feature x (input, observation); y-axis: target y (dependent variable, output)]
Source: Raschka and Mirjalili (2019). Python Machine Learning, 3rd Edition
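To make the regression setting concrete, here is a minimal sketch that fits a straight line to one feature and one continuous target using ordinary least squares. The data values and the helper name `fit_line` are made up for illustration and are not from the slides; the course itself may approach regression differently.

```python
# Toy data (hypothetical): one feature x and one continuous target y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]  # here y = 2x exactly


def fit_line(xs, ys):
    """Ordinary least squares for y ≈ w * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


w, b = fit_line(xs, ys)
prediction = w * 6.0 + b  # predicted target for a new observation x = 6
print(w, b, prediction)
```

The output of the model is a real number, which is what separates regression from the classification setting.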
Supervised Learning 2: Classification
Binary classification example with two features ("independent" variables, predictors)
What are the class labels (y's)?
[Figure: scatter plot of two classes separated by a linear decision boundary; axes x1, x2]
Source: Raschka and Mirjalili (2019). Python Machine Learning, 3rd Edition
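A linear decision boundary in two features can be sketched as a line w1*x1 + w2*x2 + b = 0, with the class label decided by which side of the line a point falls on. The weights, bias, and points below are hypothetical values chosen for illustration; in practice they would be learned from labeled data.

```python
def predict(x1, x2, w1=1.0, w2=1.0, b=-3.0):
    """Return class label 1 if the point lies on the positive side of the
    line w1*x1 + w2*x2 + b = 0, else 0. The default weights are hypothetical."""
    z = w1 * x1 + w2 * x2 + b
    return 1 if z >= 0 else 0


# Four toy points with two features each.
points = [(0.5, 1.0), (1.0, 1.5), (2.5, 2.0), (3.0, 3.0)]
labels = [predict(x1, x2) for x1, x2 in points]
print(labels)
```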
Supervised Learning 3: Ordinal regression
• Ordinal regression is also called ordinal classification or ranking
(although ranking is a bit different)
• The labels are ordered like the targets in metric regression and discrete like the labels in classification,
but there is no metric distance between them
rK ≻ rK−1 ≻ . . . ≻ r1
E.g., movie ratings: great ≻ good ≻ okay ≻ for genre fans ≻ bad
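One way to make the ordering explicit in code is to encode each rank as K−1 binary answers to the question "is the label greater than r_j?". This is a sketch of one possible encoding and is an assumption for illustration, not something this slide prescribes; the function names `encode` and `decode` are made up.

```python
# Ranks in ascending order: r1 ≺ r2 ≺ ... ≺ rK (movie ratings from the slide).
ranks = ["bad", "for genre fans", "okay", "good", "great"]


def encode(label):
    """Encode a rank as K-1 binary indicators: is label > r_j, for j = 1..K-1?"""
    k = ranks.index(label)
    return [1 if k > j else 0 for j in range(len(ranks) - 1)]


def decode(bits):
    """Recover the rank by counting how many thresholds were exceeded."""
    return ranks[sum(bits)]


print(encode("okay"))          # order is preserved: higher ranks set more bits
print(decode(encode("good")))
```

Unlike one-hot class labels, this encoding keeps the order (a higher rank never has fewer ones), but it still assigns no metric distance between ranks.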
Supervised Learning 3: Ordinal regression
• Ranking: Predict correct order
(0 loss if order is correct, e.g., rank a collection of movies by "goodness")
• Ordinal regression: Predict correct (ordered) label
(E.g., age of a person in years; here, regard aging as a non-stationary process)
[Figure: excerpt from the UTKFace dataset, https://susanqq.github.io/UTKFace/; three face images with age labels 18, 29, and 41]
The 2nd Subcategory Of ML (And DL)
Unsupervised Learning: no labels/targets; no feedback; find hidden structure in data
Source: Raschka and Mirjalili (2019). Python Machine Learning, 3rd Edition
Unsupervised Learning 1:
Representation Learning/Dimensionality Reduction
E.g., Principal Component Analysis (PCA)
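[Figure: PCA illustration; data shown in the original axes x1 and x2 with principal components PC1 and PC2, and in the PC1/PC2 coordinate system]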
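As a rough sketch of what PCA does in two dimensions, the snippet below finds the direction of largest variance (PC1) from the covariance matrix and projects the data onto it. The data points and the function name `first_principal_axis` are hypothetical, and the closed-form angle only applies to the 2-D case; libraries typically use an eigendecomposition or SVD instead.

```python
import math

# Toy 2-D data (hypothetical) that varies mostly along the x1 = x2 diagonal.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]


def first_principal_axis(points):
    """Return the PC1 unit vector and the data mean for 2-D points."""
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    # Entries of the 2x2 sample covariance matrix.
    c11 = sum((p[0] - m1) ** 2 for p in points) / (n - 1)
    c22 = sum((p[1] - m2) ** 2 for p in points) / (n - 1)
    c12 = sum((p[0] - m1) * (p[1] - m2) for p in points) / (n - 1)
    # Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * c12, c11 - c22)
    return (math.cos(theta), math.sin(theta)), (m1, m2)


pc1, mean = first_principal_axis(data)
# Dimensionality reduction: keep only the coordinate along PC1.
projected = [(p[0] - mean[0]) * pc1[0] + (p[1] - mean[1]) * pc1[1] for p in data]
print(pc1, projected)
```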
Unsupervised Learning 1:
Representation Learning/Dimensionality Reduction
E.g., Autoencoders
[Figure: input image → Encoder → latent representation/feature embedding → Decoder → reconstructed image]
Source: https://3.bp.blogspot.com/-OUd11VBJNAM/VsFacR_YhBI/AAAAAAAABh0/ZKfKAnRj3x0/s1600/cannot%2Bresist.jpg
(covered later in this course)
Reminder: Classification works like this
[Figure: x = input image → Network → p(y=cat); label y = Cat]
Source: https://3.bp.blogspot.com/-OUd11VBJNAM/VsFacR_YhBI/AAAAAAAABh0/ZKfKAnRj3x0/s1600/cannot%2Bresist.jpg
Unsupervised Learning 2: Clustering
Assigning group memberships to unlabelled examples (instances, data points)
[Figure: scatter plot of unlabeled examples grouped into clusters; axes x1, x2]
Source: Raschka and Mirjalili (2019). Python Machine Learning, 3rd Edition
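Assigning group memberships without labels can be sketched with k-means, one common clustering algorithm (the slide does not name a specific method, so this choice is an assumption). The points and initial centroids below are hypothetical.

```python
def kmeans(points, centroids, n_iter=10):
    """Plain k-means on 2-D points, starting from the given initial centroids."""
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for each point.
        assignments = [
            min(range(len(centroids)),
                key=lambda j: (p[0] - centroids[j][0]) ** 2
                + (p[1] - centroids[j][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        centroids = new_centroids
    return assignments, centroids


# Two well-separated groups of unlabeled points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
assignments, centroids = kmeans(points, centroids=[points[0], points[3]])
print(assignments, centroids)
```

No labels are used anywhere; the group memberships come only from distances between the examples.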
Reinforcement Learning:
The third subcategory of ML (and DL)
[Figure: agent-environment loop with state St, action At, reward Rt, next state St+1, and next reward Rt+1]
Figure 5: Representation of the basic reinforcement learning paradigm with a simple molecular example. (1) Given a benzene ring (state St at iteration t) and some reward value Rt at iteration t, (2) the agent selects an action At that adds a methyl group to the benzene ring. (3) The environment considers this information for producing the next state (St+1) and reward (Rt+1). This cycle repeats until the episode is terminated.
Source: Sebastian Raschka and Benjamin Kaufman (2020). Machine learning and AI-based approaches for bioactive ligand discovery and GPCR-ligand recognition
(Won't cover this in this course)
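The agent-environment cycle described in the figure caption can be sketched as a simple loop. The corridor environment, its reward values, and the fixed "always move right" policy below are hypothetical stand-ins chosen for brevity; they are not the molecular example from the figure, and no learning takes place in this sketch.

```python
class CorridorEnv:
    """Hypothetical environment: positions 0..4, the episode ends at position 4."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        """Apply action (+1 or -1) and return (next_state, reward, done)."""
        self.state = max(0, min(4, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else -0.1  # small cost per step, bonus at the goal
        return self.state, reward, done


def policy(state):
    """Placeholder agent that always moves right; a real agent would learn this."""
    return +1


env = CorridorEnv()
state, total_reward, done, steps = env.state, 0.0, False, 0
while not done:
    action = policy(state)                   # agent picks A_t given S_t
    state, reward, done = env.step(action)   # environment returns S_{t+1}, R_{t+1}
    total_reward += reward
    steps += 1
print(steps, total_reward)
```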
Reinforcement Learning:
The third subcategory of ML (and DL)
Vinyals, Oriol, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha
Vezhnevets, Michelle Yeo, Alireza Makhzani et al. "Starcraft II: A new challenge for
reinforcement learning." arXiv preprint arXiv:1708.04782 (2017).
Semi-Supervised Learning
• mix between supervised and unsupervised
learning
• some training examples contain outputs, but
some do not
• use the labeled training subset to label the
unlabeled portion of the training set, which we
then also utilize for model training
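The last bullet describes what is often called pseudo-labeling or self-training. Below is a minimal sketch with a one-dimensional nearest-centroid classifier; the data, the classifier choice, and the helper names `fit_centroids` and `predict` are assumptions made for illustration only.

```python
def fit_centroids(xs, ys):
    """Nearest-centroid 'model': the mean feature value of each class."""
    return {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c)
            for c in set(ys)}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


# Hypothetical 1-D data: a few labeled examples and more unlabeled ones.
x_labeled, y_labeled = [1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1]
x_unlabeled = [0.5, 1.5, 3.0, 7.0, 8.5, 10.0]

# 1) Train on the labeled subset only.
model = fit_centroids(x_labeled, y_labeled)
# 2) Use that model to label the unlabeled portion (pseudo-labels).
pseudo_labels = [predict(model, x) for x in x_unlabeled]
# 3) Retrain on the labeled plus pseudo-labeled examples.
model = fit_centroids(x_labeled + x_unlabeled, y_labeled + pseudo_labels)
print(pseudo_labels, model)
```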
Semi-Supervised Learning
Illustration of semi-supervised learning incorporating unlabeled examples. (A) A decision boundary derived
from the labeled training examples only. (B) A decision boundary based on both labeled and unlabeled
examples.
Self-Supervised Learning
• A recent development and promising research
trend in deep learning
• particularly useful if pre-trained models for transfer
learning are not available for the target domain
• a process of deriving and utilizing label information
directly from the data itself rather than having
humans annotating it
Self-Supervised Learning
Self-supervised learning via context prediction. (A) A random patch is sampled (red square) along with 9
neighboring patches. (B) Given the random patch and a random neighbor patch, the task is to predict
the position of the neighboring patch relative to the center patch (red square).
The Supervised
Learning Workflow
1. Course overview
2. What is machine learning?
3. The broad categories of ML
4. The supervised learning workflow
5. Necessary ML notation and jargon
6. About the practical aspects and tools
Supervised Learning Workflow
1 Training: Training Dataset (Observations, Labels) → Feature Extraction → Learning Algorithm → Model
2 Inference: New Observations → Feature Extraction → Model → Predicted Labels
Using a test dataset to evaluate the performance of a
predictive model
[Figure: New Observations → Feature Extraction → Model → Predicted Labels, which are evaluated against the Known Labels]
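The training, inference, and evaluation steps can be sketched end to end with a deliberately simple model. The threshold classifier, the data split, and the function names below are hypothetical; the point is only that the test labels are held back during training and compared with the predictions afterwards.

```python
def fit_threshold(xs, ys):
    """'Learning algorithm': place a threshold midway between the class means."""
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2


def predict(threshold, x):
    """'Model': predict class 1 for feature values above the threshold."""
    return 1 if x > threshold else 0


# Hypothetical dataset with one feature, split into training and test parts.
x_train, y_train = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0], [0, 0, 0, 1, 1, 1]
x_test, y_test = [2.5, 4.0, 6.5, 5.5], [0, 0, 1, 0]

threshold = fit_threshold(x_train, y_train)        # 1) training
y_pred = [predict(threshold, x) for x in x_test]   # 2) inference on unseen data
# Evaluation: compare predicted labels with the known test labels.
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)
print(threshold, y_pred, accuracy)
```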
Structured vs Unstructured Data
Machine Learning vs Deep Learning
Image source: Stevens et al., Deep Learning with PyTorch. Manning, 2020
Machine Learning
Terminology and Notation
(Again, this also applies to DL)
1. Course overview
2. What is machine learning?
3. The broad categories of ML
4. The supervised learning workflow
5. Necessary ML notation and jargon
6. About the practical aspects and tools
Machine Learning Jargon 1/2
• supervised learning:
learn function to map input x (features) to
output y (targets)
• structured data:
databases, spreadsheets/CSV files
• unstructured data:
features like image pixels, audio signals,
text sentences
(before DL, extensive feature engineering
was required)
Source: http://rasbt.github.io/mlxtend/user_guide/image/extract_face_landmarks/
Supervised Learning (More Formal Notation)
"training examples"
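Training set: $\mathcal{D} = \{\langle \mathbf{x}^{[i]}, y^{[i]} \rangle, \; i = 1, \dots, n\}$
Unknown function: $f(\mathbf{x}) = y$
(sometimes $t$ or $o$)
Hypothesis: $h(\mathbf{x}) = \hat{y}$
Classification: $h: \mathbb{R}^m \rightarrow \mathcal{Y}, \quad \mathcal{Y} = \{1, \dots, k\}$
Regression: $h: \mathbb{R}^m \rightarrow \mathbb{R}$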
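The notation maps onto code fairly directly. In the sketch below, the training set is a list of (feature vector, label) pairs, and a hypothesis is just a function from a feature vector to a prediction. The concrete values and the two hypothesis functions are arbitrary choices for illustration, not learned models.

```python
from typing import List, Tuple

FeatureVector = Tuple[float, ...]                # x in R^m
TrainingSet = List[Tuple[FeatureVector, int]]    # D = {<x[i], y[i]>, i = 1..n}

# Hypothetical training set with n = 3 examples and m = 2 features.
D: TrainingSet = [((0.0, 1.0), 1), ((2.0, 0.5), 2), ((1.5, 1.5), 1)]


def h_classify(x: FeatureVector) -> int:
    """A classification hypothesis h: R^m -> {1, ..., k}, here with k = 2."""
    return 1 if x[0] < 1.8 else 2


def h_regress(x: FeatureVector) -> float:
    """A regression hypothesis h: R^m -> R (an arbitrary linear function)."""
    return 0.5 * x[0] + 2.0 * x[1]


n, m = len(D), len(D[0][0])
y_hat = [h_classify(x) for x, _ in D]  # predictions for every training example
print(n, m, y_hat)
```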
Data Representation
Feature vector: $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}$
Data Representation
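Feature vector: $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}$
Design matrix: $\mathbf{X} = \begin{bmatrix} \mathbf{x}_1^\top \\ \mathbf{x}_2^\top \\ \vdots \\ \mathbf{x}_n^\top \end{bmatrix} = \begin{bmatrix} x_1^{[1]} & x_2^{[1]} & \cdots & x_m^{[1]} \\ x_1^{[2]} & x_2^{[2]} & \cdots & x_m^{[2]} \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{[n]} & x_2^{[n]} & \cdots & x_m^{[n]} \end{bmatrix}$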
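In code, the design matrix is simply the feature vectors stacked as rows, so it has n rows (examples) and m columns (features). The numbers below are hypothetical, and nested lists are used only to keep the sketch dependency-free; an array library would be the usual choice.

```python
# Hypothetical feature vectors x[1], ..., x[4], each with m = 3 features.
x_1 = [5.1, 3.5, 1.4]
x_2 = [4.9, 3.0, 1.4]
x_3 = [6.2, 2.9, 4.3]
x_4 = [5.9, 3.0, 5.1]

# Design matrix X: row i is the (transposed) feature vector of example i.
X = [x_1, x_2, x_3, x_4]

n = len(X)      # number of training examples (rows)
m = len(X[0])   # number of features (columns)

# Column j collects feature j across all examples.
feature_2 = [row[1] for row in X]
print(n, m, feature_2)
```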
Data Representation (structured data)
m= _____
n= _____
Data Representation (unstructured data; images)
"traditional methods"
Data Representation (unstructured data; images)
Convolutional Neural Networks
Image batch dimensions: torch.Size([128, 1, 28, 28]) ("NCHW" representation; more on that later)
Image label dimensions: torch.Size([128])
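The reported shapes can be unpacked as follows. This sketch uses plain tuples rather than PyTorch tensors so that it runs without extra dependencies, and it assumes that "traditional methods" flatten each image into a single feature vector, which the slides do not state explicitly.

```python
# Shapes reported on the slide for one mini-batch.
image_batch_shape = (128, 1, 28, 28)
label_batch_shape = (128,)

# "NCHW": number of images, channels, height, width.
N, C, H, W = image_batch_shape

# Flattening each image gives one feature vector per image (assumed here to be
# the input format of "traditional" methods).
features_per_image = C * H * W
flattened_shape = (N, features_per_image)
print(flattened_shape)
```

A convolutional network keeps the height and width dimensions instead of flattening them, which is why the batch stays four-dimensional.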
Machine Learning Jargon 2/2
• Training a model = fitting a model = parameterizing a model = learning from data
• Training example, synonymous to
training record, training instance, training sample (in some contexts, sample refers to a
collection of training examples)
• Feature, synonymous to
observation, predictor, variable, independent variable, input, attribute, covariate
• Target, synonymous to
outcome, ground truth, output, response variable, dependent variable, (class) label (in
classification)
• Output / Prediction: the output from the model; use this term to distinguish it from the targets
The Practical Aspects:
Our Tools!
1. Course overview
2. What is machine learning?
3. The broad categories of ML
4. The supervised learning workflow
5. Necessary ML notation and jargon
6. About the practical aspects and tools
Main Scientific Python Libraries
Stat 451 FS2020 (Machine Learning)
Main tools for this course
Image by Jake VanderPlas. Source: https://speakerdeck.com/jakevdp/the-state-of-the-stack-scipy-2015-keynote?slide=8
"The State of Machine Learning Frameworks in 2019"
Source: https://thegradient.pub/state-of-ml-frameworks-2019-pytorch-dominates-research-tensorflow-dominates-industry/
Source: http://yann.lecun.com/exdb/lenet/
https://code.visualstudio.com
https://github.com/rasbt/stat453-deep-learning-ss21/tree/main/L01/code
Further Resources and Reading Materials
• "Introduction to Machine Learning and Deep Learning", article based on
these slides https://sebastianraschka.com/blog/2020/intro-to-dl-ch01.html
• STAT451 FS2021: Intro to Machine Learning, lecture notes:
https://github.com/rasbt/stat451-machine-learning-fs20/blob/master/L01/01-ml-overview__notes.pdf
• Python Machine Learning, 3rd Ed. Packt 2019. Chapter 1.
Next Lecture:
A Brief Summary of the
History of
Neural Networks and Deep Learning