Autoencoders
Introduction [1]
Autoencoders (AE) are neural networks that aim to copy their inputs to their outputs.
They work by compressing the input into a latent-
space representation, and then reconstructing the
output from this representation.
This kind of network is composed of two parts:
Encoder: This is the part of the network that compresses the input
into a latent-space representation. It can be represented by an
encoding function h=f(x).
Decoder: This part aims to reconstruct the input from the latent
space representation. It can be represented by a decoding function
r=g(h).
The autoencoder as a whole can thus be described by the function g(f(x)) = r, where you want r to be as close as possible to the original input x.
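To make the notation concrete, here is a purely illustrative NumPy sketch of an (untrained) encoder f and decoder g; the 784-dimensional input and 32-dimensional code size are assumptions, not fixed by the slides.

import numpy as np

rng = np.random.default_rng(0)
W_enc = rng.normal(size=(784, 32))   # assumed sizes: 784-dim input, 32-dim code
W_dec = rng.normal(size=(32, 784))

def f(x):                 # encoder: h = f(x)
    return np.maximum(0, x @ W_enc)

def g(h):                 # decoder: r = g(h)
    return h @ W_dec

x = rng.normal(size=(1, 784))
r = g(f(x))               # the reconstruction; training would make r close to x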
Why copy the input to the output? [1]
If the only purpose of autoencoders were to copy the input to the output, they would be useless.
Indeed, we hope that, by training the autoencoder
to copy the input to the output, the latent
representation h will take on useful properties.
This can be achieved by creating constraints on
the copying task.
One way to obtain useful features from the autoencoder is to constrain h to have a smaller dimension than x; in this case the autoencoder is called undercomplete.
By training an undercomplete representation, we
force the autoencoder to learn the most salient
features of the training data.
If the autoencoder is given too much capacity, it
can learn to perform the copying task without
extracting any useful information about the
distribution of the data.
This can also occur if the dimension of the latent representation is the same as that of the input, and in the overcomplete case, where the dimension of the latent representation is greater than that of the input.
In these cases, even a linear encoder and linear
decoder can learn to copy the input to the output
without learning anything useful about the data
distribution.
Ideally, one could train any architecture of
autoencoder successfully, choosing the code
dimension and the capacity of the encoder and
decoder based on the complexity of the distribution to be modelled.
Types of Autoencoders [1]
Vanilla autoencoder
Multilayer autoencoder
Convolutional autoencoder
Regularized autoencoder
Vanilla autoencoder
In its simplest form, the autoencoder is a three-layer net, i.e. a neural net with one hidden layer.
The input and output are the same, and we learn how to
reconstruct the input, for example using the adam
optimizer and the mean squared error loss function.
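The fit and predict calls below assume a model along the lines of the following minimal sketch, compiled with adam and mean squared error; the 784-dimensional input and 32-dimensional code are illustrative assumptions (e.g. flattened 28x28 images).

from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(784,))                        # assumed input size
code = layers.Dense(32, activation="relu")(inputs)         # the single hidden (code) layer
outputs = layers.Dense(784, activation="sigmoid")(code)    # reconstruction of the input

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")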
autoencoder.fit(x_train, x_train, epochs=5)
reconstructed = autoencoder.predict(x_test)
Practical Advice [2]:
We have total control over the architecture of the
autoencoder.
We can make it very powerful by increasing the number of
layers, nodes per layer and most importantly the code size.
Increasing these hyperparameters will let the autoencoder learn more complex codings.
But we should be careful not to make it too powerful.
Otherwise the autoencoder will simply learn to copy its
inputs to the output, without learning any meaningful
representation.
It will just mimic the identity function.
The autoencoder will reconstruct the training data
perfectly, but it will be overfitting without being able to
generalize to new instances, which is not what we want.
This is why we prefer a “sandwich” architecture, and deliberately keep the code size small.
Since the coding layer has a lower dimensionality than the
input data, the autoencoder is said to be undercomplete.
It won’t be able to directly copy its inputs to the output,
and will be forced to learn intelligent features.
If the input data has a pattern, for example the digit “1”
usually contains a somewhat straight line and the digit “0” is
circular, it will learn this fact and encode it in a more compact
form.
If the input data were completely random, without any internal correlation or dependency, then an undercomplete autoencoder would not be able to recover it perfectly.
But luckily, in the real world there is a lot of dependency.
Multilayer autoencoder
autoencoder.fit(x_train, x_train, epochs=5)
reconstructed = autoencoder.predict(x_test)
Now our implementation uses 3 hidden layers instead of
just one.
Any of the hidden layers can be picked as the feature
representation but we will make the network symmetrical
and use the middle-most layer.
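A sketch of such a symmetric network with three hidden layers (the 784/128/32 sizes are illustrative assumptions) might be:

from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(784,))
h = layers.Dense(128, activation="relu")(inputs)
code = layers.Dense(32, activation="relu")(h)              # middle-most layer: the feature representation
h = layers.Dense(128, activation="relu")(code)
outputs = layers.Dense(784, activation="sigmoid")(h)

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")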
Convolutional autoencoder
Notice: padding=“valid”
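The full convolutional architecture is not listed here; the sketch below, modeled on the common Keras convolutional autoencoder for 28x28x1 images, is one possibility (layer sizes are assumptions). Note the single padding="valid" convolution in the decoder, which trims the 16x16 feature maps back to 14x14 and is presumably the layer the note above refers to.

from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(28, 28, 1))
x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D((2, 2), padding="same")(x)                    # 14x14
x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2), padding="same")(x)                    # 7x7
x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D((2, 2), padding="same")(x)              # 4x4 code

x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(encoded)
x = layers.UpSampling2D((2, 2))(x)                                    # 8x8
x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
x = layers.UpSampling2D((2, 2))(x)                                    # 16x16
x = layers.Conv2D(16, (3, 3), activation="relu", padding="valid")(x)  # 14x14
x = layers.UpSampling2D((2, 2))(x)                                    # 28x28
outputs = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
# x_train and x_test must be shaped (n, 28, 28, 1) before the fit/predict calls below.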
autoencoder.fit(x_train, x_train, epochs=5)
reconstructed = autoencoder.predict(x_test)
Regularized autoencoder
There are other ways we can constrain the reconstruction of an autoencoder than to impose a hidden layer of smaller dimension than the input.
Rather than limiting the model capacity by keeping the
encoder and decoder shallow and the code size small,
regularized autoencoders use a loss function that
encourages the model to have other properties besides the
ability to copy its input to its output.
In practice, we usually find two types of regularized
autoencoder:
the sparse autoencoder and
the denoising autoencoder.
Sparse autoencoder
Recall that when we apply a weight regularizer, we add a term to the loss function, such as ∑_i ∑_j |w_ij| for L1 regularization or ∑_i ∑_j w_ij^2 for L2 regularization. This ensures that the weights stay small, so the model is simpler and we can avoid overfitting.
Here, in the sparse autoencoder, we instead regularize the outputs (that is, the activations) of the neurons, so they are small and many are zero, leading to a sparse representation.
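A sketch of how this could be done in Keras is given below, using an L1 activity regularizer on the code layer (the 1e-5 coefficient and layer sizes are illustrative assumptions); the fit and predict calls below then stay exactly the same.

from tensorflow.keras import layers, regularizers, Model

inputs = layers.Input(shape=(784,))
code = layers.Dense(32, activation="relu",
                    activity_regularizer=regularizers.l1(1e-5))(inputs)  # penalize activations
outputs = layers.Dense(784, activation="sigmoid")(code)

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")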
autoencoder.fit(x_train, x_train, epochs=5)
reconstructed = autoencoder.predict(x_test)
Denoising autoencoder [2]
Keeping the code layer small forced our autoencoder to
learn an intelligent representation of the data.
There is another way to force the autoencoder to learn
useful features, which is adding random noise to its inputs
and making it recover the original noise-free data.
This way the autoencoder can’t simply copy the input to
its output because the input also contains random noise.
We are asking it to subtract the noise and produce the
underlying meaningful data. This is called a denoising
autoencoder.
In the example figure, the top row contains the original images.
We add random Gaussian noise to them and the noisy
data becomes the input to the autoencoder.
The autoencoder doesn’t see the original image at all.
But then we expect the autoencoder to regenerate the
noise-free original image.
There is only one small difference between the implementation of a denoising autoencoder and the regular one. The architecture doesn't change at all; only the fit call does.
We trained the regular autoencoder as follows:
autoencoder.fit(x_train, x_train)
The denoising autoencoder is trained as:
autoencoder.fit(x_train_noisy, x_train)
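A sketch of how x_train_noisy might be produced (the 0.3 noise level is an illustrative assumption):

import numpy as np

noise_factor = 0.3
x_train_noisy = x_train + noise_factor * np.random.normal(size=x_train.shape)
x_train_noisy = np.clip(x_train_noisy, 0.0, 1.0)   # keep pixel values in [0, 1]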
Stacked Autoencoders [3]
Suppose you wished to train a stacked autoencoder with 2
hidden layers for classification of MNIST digits.
First, you would train a sparse autoencoder on the raw inputs x^(k) to learn primary features h^(1)(k) on the raw input.
Next, you would feed the raw input into this trained sparse autoencoder, obtaining the primary feature activations h^(1)(k) for each of the inputs x^(k).
You would then use these primary features as the "raw input" to another sparse autoencoder to learn secondary features h^(2)(k) on these primary features.
Following this, you would feed the primary features into the second sparse autoencoder to obtain the secondary feature activations h^(2)(k) for each of the primary features h^(1)(k) (which correspond to the primary features of the corresponding inputs x^(k)).
You would then treat these secondary features as "raw
input" to a softmax classifier, training it to map secondary
features to digit labels.
Finally, you would combine all three layers together to form a
stacked autoencoder with 2 hidden layers and a final softmax
classifier layer capable of classifying the MNIST digits as
desired.
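A rough Keras sketch of this greedy layer-wise procedure is given below. For brevity it uses plain bottleneck autoencoders rather than the sparsity penalty described above, and the layer sizes, epoch counts and the availability of integer labels y_train are all illustrative assumptions.

import tensorflow as tf
from tensorflow.keras import layers, Model

def train_autoencoder(x, code_dim):
    # Train a one-hidden-layer autoencoder on x and return its encoder part.
    inp = layers.Input(shape=(x.shape[1],))
    code = layers.Dense(code_dim, activation="relu")(inp)
    out = layers.Dense(x.shape[1], activation="sigmoid")(code)
    ae = Model(inp, out)
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(x, x, epochs=5, verbose=0)
    return Model(inp, code)                 # encoder only: x -> h

enc1 = train_autoencoder(x_train, 128)      # primary features h^(1)
h1 = enc1.predict(x_train)
enc2 = train_autoencoder(h1, 64)            # secondary features h^(2)
h2 = enc2.predict(h1)

# Softmax classifier trained on the secondary features
clf_in = layers.Input(shape=(64,))
clf_out = layers.Dense(10, activation="softmax")(clf_in)
clf = Model(clf_in, clf_out)
clf.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
clf.fit(h2, y_train, epochs=5, verbose=0)

# Combine the two encoders and the classifier into one stacked network,
# which can then be fine-tuned end-to-end on the labels.
stacked = tf.keras.Sequential([enc1, enc2, clf])
stacked.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
stacked.fit(x_train, y_train, epochs=5, verbose=0)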
A stacked autoencoder enjoys all the benefits of any deep
network of greater expressive power.
Further, it often captures a useful "hierarchical grouping" or
"part-whole decomposition" of the input.
To see this, recall that an autoencoder tends to learn
features that form a good representation of its input.
The first layer of a stacked autoencoder tends to learn first-
order features in the raw input (such as edges in an image).
The second layer of a stacked autoencoder tends to learn
second-order features corresponding to patterns in the
appearance of first-order features (e.g., in terms of what
edges tend to occur together--for example, to form contour or
corner detectors).
Higher layers of the stacked autoencoder tend to learn even
higher-order features.
Stacked Autoencoders [4]
Fine Tuning Stacked Autoencoders [4]
Stacked Denoising Autoencoders [4]
References
1. https://towardsdatascience.com/deep-inside-autoencoders-7e41f319999f
2. https://towardsdatascience.com/applied-deep-learning-part-3-autoencoders-1c083af4d798
3. http://ufldl.stanford.edu/wiki/index.php/Stacked_Autoencoders
4. https://in.mathworks.com/help/deeplearning/examples/train-stacked-autoencoders-for-image-classification.html
Disclaimer
These slides are not original and have been
prepared from various sources for teaching
purposes.