
Review: An Introduction to VAE

Type: Literature

Introduction
Motivation
Discriminative models predict outcomes based on observed data, whereas generative models learn the joint distribution of all variables, simulating how data is generated in the real world

Generative modeling is attractive because it can incorporate physical laws and can simplify unknown details by treating them as noise → intuitive, interpretable models that can be tested against observations to confirm or reject hypotheses

Transforming a generative model into a discriminative one involves Bayes’ rule, though the computational cost is high. Discriminative methods directly map inputs to predictions and are efficient with large data → but they suffer higher bias if the model assumptions are incorrect

Variational Autoencoders (VAEs) bridge the two approaches by combining two models: the encoder (recognition model) and the decoder (generative model)

The encoder approximates the posterior distribution, facilitating expectation maximization during training

VAEs improve efficiency by “amortized inference”, where a single set of parameters models the relationship between input and latent variables

The VAE framework is inspired by the Helmholtz Machine but avoids its
inefficiencies by optimizing a single objective through the reparameterization
trick, which reduces gradient noise during learning

VAEs combine graphical models with deep learning, organizing latent
variables in hierarchical Bayesian networks, and are optimized through
expectation maximization and backpropagation

Aim
A principled approach for jointly learning deep latent-variable models and
inference models through stochastic gradient descent → supporting
applications like generative modeling, semi-supervised learning, and
representation learning

The structure of the paper includes a discussion of probabilistic models, directed graphical models, and their integration with neural networks, as well as learning approaches for fully observed and deep latent-variable models (DLVMs)

Chapter 2 covers the basics of VAEs

Chapter 3 explores advanced inference techniques

Chapter 4 addresses advanced generative models

Mathematical notation can be found in section A.1

Probabilistic models and Variational inference


Probabilistic models inherently involve unknown factors → a complete model would specify all correlations and higher-order dependencies between its variables, forming a comprehensive joint probability distribution

A vector x represents all observed variables whose joint distribution is modeled

The true distribution of x, denoted p∗(x), is generally unknown, so the goal is to approximate it with a model pθ(x), where θ represents the parameters

The learning process involves finding values for θ so that pθ(x) closely approximates p∗(x) for any observed x

For effective modeling, pθ(x) must be flexible enough to adapt to the data while allowing prior knowledge about the data distribution to be integrated into the model

Conditional models
A conditional model, pθ(y∣x), is preferred over an unconditional model, pθ(x), when the interest lies in predicting y from x.

This model approximates the distribution p∗(y∣x), which represents the probability distribution over possible values of y (as a label) given an observed variable x.

x is typically considered the model’s input


The goal is to choose and optimize pθ(y∣x) so that it closely approximates p∗(y∣x) for any given x and y

$$p_\theta(y \mid x) \approx p^*(y \mid x)$$

Parameterizing conditional distributions with Neural networks
Neural networks are used to parameterize probability density functions
(PDFs) or probability mass functions (PMFs) → allowing for stochastic
gradient-based optimization, which enables scaling to large datasets and
models

In applications like image classification, neural networks can parameterize a conditional distribution pθ(y∣x) over a label y, given an input image x

The network can be represented as a function, denoted NeuralNet(x), which outputs the parameter vector p of the categorical distribution pθ(y∣x) = Categorical(y; p)

The final layer in such models typically employs a softmax function to ensure that the output probabilities sum to one (∑i pi = 1) → suited to classification tasks (a minimal sketch follows)
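A minimal sketch of this parameterization, assuming PyTorch; the layer sizes, the 784-dimensional input, the 10 classes, and the use of torch.distributions.Categorical are illustrative choices rather than anything specified in the source.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical

class NeuralNet(nn.Module):
    """Maps an input image x to the parameter vector p of Categorical(y; p)."""
    def __init__(self, in_dim=784, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        logits = self.net(x)
        # softmax guarantees p_i >= 0 and sum_i p_i = 1
        return torch.softmax(logits, dim=-1)

# p_theta(y | x) = Categorical(y; p) with p = NeuralNet(x)
x = torch.randn(32, 784)                      # a batch of flattened images (assumed shape)
p = NeuralNet()(x)                            # class probabilities, shape (32, 10)
y = torch.randint(0, 10, (32,))               # hypothetical labels
log_p_y_given_x = Categorical(probs=p).log_prob(y)
```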

Directed graphical models and Neural networks

Directed probabilistic models, also known as directed probabilistic graphical models (PGMs) or Bayesian networks, structure variables in a directed acyclic graph (DAG), where the joint distribution over the variables factorizes into a product of prior and conditional distributions:

$$p_\theta(x_1, \ldots, x_M) = \prod_{j=1}^{M} p_\theta(x_j \mid Pa(x_j))$$

Pa(xj) represents the set of parent variables of node j in the graph

Root nodes have no parents, so their distributions are unconditional

Neural networks provide a flexible approach by taking the parent variables of a node as input and producing the distributional parameters, η, for that variable:

$$\eta = \text{NeuralNet}(Pa(x))$$

$$p_\theta(x \mid Pa(x)) = p_\theta(x \mid \eta)$$

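As a concrete sketch of this pattern (assuming PyTorch; the Bernoulli node, the two-parent setup, and the layer sizes are assumptions for illustration, not taken from the source), a small network maps the parent values to the distributional parameters η of one node:

```python
import torch
import torch.nn as nn
from torch.distributions import Bernoulli

class NodeConditional(nn.Module):
    """Computes eta = NeuralNet(Pa(x)) and returns p_theta(x | Pa(x)) = p_theta(x | eta)."""
    def __init__(self, num_parents=2, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_parents, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),             # eta: a single Bernoulli logit for node x
        )

    def forward(self, parents):
        eta = self.net(parents)               # distributional parameters for this node
        return Bernoulli(logits=eta.squeeze(-1))

pa_x = torch.randn(16, 2)                     # a batch of values for the two parent variables
p_x_given_pa = NodeConditional()(pa_x)        # the conditional distribution p_theta(x | Pa(x))
log_prob = p_x_given_pa.log_prob(torch.ones(16))
```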
Learning in fully observed models with Neural nets


Dataset
A dataset D typically consists of N ≥ 1 datapoints:

$$D = \{x^{(1)}, x^{(2)}, \ldots, x^{(N)}\} \equiv \{x^{(i)}\}_{i=1}^{N} \equiv x^{(1:N)}$$

Each datapoint is an independent sample from the same underlying distribution (the dataset is composed of distinct, independent measurements from a stable system) → the observations in D are independently and identically distributed (i.i.d.)

The probability of the dataset given the model parameters θ can be expressed as a product of individual datapoint probabilities. The log-probability assigned to the data by the model is thus:

$$\log p_\theta(D) = \sum_{x \in D} \log p_\theta(x)$$

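A toy illustration (not from the source): with a univariate Gaussian standing in for pθ(x) and θ = (μ, log σ), the dataset log-probability is just the sum of per-datapoint log-probabilities.

```python
import torch
from torch.distributions import Normal

# toy model p_theta(x): a univariate Gaussian with parameters theta = (mu, log_sigma)
mu = torch.tensor(0.0, requires_grad=True)
log_sigma = torch.tensor(0.0, requires_grad=True)

data = 2.0 * torch.randn(1000) + 1.0          # a synthetic dataset D

# log p_theta(D) = sum over x in D of log p_theta(x)
log_p_D = Normal(mu, log_sigma.exp()).log_prob(data).sum()
```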
Maximum likelihood and Minibatch SGD

Maximizing the log-likelihood is equivalent to minimizing the Kullback-Leibler (KL) divergence between the data distribution and the model’s distribution → maximum likelihood seeks parameters θ that maximize the summed log-probability of the dataset D (log pθ(D) = ∑x∈D log pθ(x))

Stochastic Gradient Descent (SGD) uses minibatches M ⊂ D of size N_M → creating an unbiased estimator (≃) of the log-probability:

$$\frac{1}{N_D} \log p_\theta(D) \simeq \frac{1}{N_M} \log p_\theta(M) = \frac{1}{N_M} \sum_{x \in M} \log p_\theta(x)$$

The stochastic gradient is then:

$$\frac{1}{N_D} \nabla_\theta \log p_\theta(D) \simeq \frac{1}{N_M} \nabla_\theta \log p_\theta(M) = \frac{1}{N_M} \sum_{x \in M} \nabla_\theta \log p_\theta(x)$$

⇒ SGD optimizes the objective by iteratively adjusting the model parameters in the direction of the stochastic gradient (a minimal sketch of this loop follows)
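A minimal sketch of the resulting training loop, assuming PyTorch and reusing the toy Gaussian model from the previous sketch; the optimizer, learning rate, minibatch size, and number of steps are illustrative assumptions.

```python
import torch
from torch.distributions import Normal

data = 2.0 * torch.randn(1000) + 1.0                      # dataset D, N_D = 1000
mu = torch.tensor(0.0, requires_grad=True)                # theta = (mu, log_sigma)
log_sigma = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.SGD([mu, log_sigma], lr=0.1)

for step in range(500):
    idx = torch.randint(0, data.shape[0], (64,))          # minibatch M ⊂ D of size N_M = 64
    minibatch = data[idx]

    # (1/N_M) * sum_{x in M} log p_theta(x): unbiased estimate of (1/N_D) log p_theta(D)
    log_p = Normal(mu, log_sigma.exp()).log_prob(minibatch)
    loss = -log_p.mean()                                   # minimize the negative estimator

    opt.zero_grad()
    loss.backward()                                        # stochastic gradient w.r.t. theta
    opt.step()                                             # step in its direction
```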

Bayesian inference
Point estimates obtained by maximum likelihood can be improved upon through:

Maximum a posteriori (MAP) estimation (see the formula after this list)

Inference of a full approximate posterior distribution over the parameters
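For reference, MAP estimation maximizes the (unnormalized) log-posterior over the parameters, i.e. it adds a log-prior term to the log-likelihood; the formula below is standard Bayes rather than something quoted in these notes:

$$\hat{\theta}_{MAP} = \arg\max_\theta \left[ \log p(\theta) + \log p_\theta(D) \right]$$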

Learning and inference in deep latent variable models
