Recent Advances of Generative Adversarial
Networks in Computer Vision
SREEJITH PB (PKD16IT053)
Guided By
Sibily Joseph and Joby NJ
Asst. Professors
Department of Computer Science and Engineering
GOVERNMENT ENGINEERING COLLEGE, SREEKRISHNAPURAM
September 2019
GEC SREEKRISHNAPURAM GAN 1 / 34
CONTENTS
• Introduction
• System Overview
• Types of GAN
• Applications
• Conclusion
• References
Introduction
• The Generative Adversarial Network (GAN), a generative approach
proposed by Goodfellow et al. in 2014, has become one of the most
discussed topics in machine learning.
• A Generative Adversarial Network can:
  • Generate high-quality images
  • Generate high-quality audio and video
  • Generate images from text
  • Convert images from one domain to another (image translation)
  • etc.
• Different types of GANs are now available for various
applications.
System Overview
• Generative adversarial networks (GANs) are deep neural network
architectures composed of two networks, a Generator (G) and a
Discriminator (D), pitted one against the other (hence
"adversarial").
Figure: Working of a GAN
• The Generator takes in random noise and returns an image.
• This generated image is fed into the Discriminator alongside a
stream of images taken from the actual data set.
• The Discriminator takes in both real and fake images and
returns a probability for each, a number between 0 and 1, with 1
representing a prediction of authenticity and 0 representing a
fake.
• The two adversaries are in constant battle: one (the Generator)
tries to fool the other (the Discriminator), while the other
tries not to be fooled.
Two Feedback Loops:
• The Discriminator is in a feedback loop with the ground truth
of the images (are they real or fake)
• The Generator is in a feedback loop with the Discriminator
(did the Discriminator label it real or fake, regardless of the
truth)
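The two feedback loops can be sketched with a toy one-dimensional GAN. This is only an illustration under assumptions that are not in the slides: the Generator is a linear map G(z) = a*z + b, the Discriminator is a logistic unit D(x) = sigmoid(w*x + c), the "real" data is drawn from a Gaussian with mean 4, and the Generator uses the common -log D(G(z)) loss. The name train_toy_gan and all parameter values are hypothetical.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def train_toy_gan(steps=200, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # Generator G(z) = a*z + b (hypothetical toy model)
    w, c = 0.0, 0.0   # Discriminator D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)   # sample from the "actual data set"
        z = rng.gauss(0.0, 1.0)        # random noise
        x_fake = a * z + b             # generated sample

        # Loop 1: Discriminator vs. ground truth (real -> 1, fake -> 0).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Loop 2: Generator vs. the Discriminator's verdict on its sample.
        d_fake = sigmoid(w * x_fake + c)
        grad_x = -(1.0 - d_fake) * w   # d(-log D(x_fake)) / d x_fake
        a -= lr * grad_x * z
        b -= lr * grad_x
    return a, b, w, c

a, b, w, c = train_toy_gan()
print("generator:", a, b, "discriminator:", w, c)
```

Note that the Generator never sees x_real directly; its only training signal is the Discriminator's output on the generated sample.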
Discriminator vs Generator
Loss Function in the Discriminative Model
The loss function of the discriminative model is the regular
cross-entropy loss associated with a binary classifier.
The predicted probability p can be written as D(x), i.e. the
probability estimated by the Discriminator D that image x is a real image.
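As a minimal sketch, the binary cross-entropy of a single prediction can be computed as below. The function names and the eps clipping constant are illustrative choices, and the notation D(G(z)) for the Discriminator's output on a generated image is an assumption taken from the usual GAN literature rather than from the slides.

```python
import math

def binary_cross_entropy(y, p, eps=1e-12):
    # y is the true label (1 = real, 0 = fake); p is the predicted probability.
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    # d_real = D(x) on a real image, d_fake = D(G(z)) on a generated one.
    return binary_cross_entropy(1, d_real) + binary_cross_entropy(0, d_fake)

print(discriminator_loss(0.9, 0.1))  # confident and correct: small loss
print(discriminator_loss(0.5, 0.5))  # undecided: larger loss
```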
The Discriminator is trained by applying the gradient descent
algorithm to minimize this loss function.
Loss Function in the Generative Model
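In the notation of the original GAN paper (Goodfellow et al., 2014), the two losses are usually written as follows. This is the standard formulation given for reference; the symbols z (the noise input), p_data and p_z are assumptions, since the slides only define D(x).

```latex
% Discriminator: binary cross-entropy over real and generated samples
L_D = -\,\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
      -\,\mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% Generator: lower the probability that its samples are labelled fake
L_G = \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

The Discriminator minimizes L_D while the Generator minimizes L_G, which is the second term of L_D with the opposite sign.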
Advantages Over VAE
• GANs are a non-parametric, generative modeling method that
does not require an approximate prior distribution of the
training data.
• A GAN works on the whole image and, by using global information
directly, takes less time to generate samples.
GAN Problems
• Non-convergence: the model parameters oscillate, destabilize
and never converge.
• Mode collapse: the generator collapses and produces only a
limited variety of samples.
• Diminished gradient: the discriminator becomes so successful
that the generator's gradient vanishes and it learns nothing.
• Imbalance between the generator and discriminator, causing
overfitting.
• High sensitivity to the choice of hyperparameters.
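The diminished-gradient problem can be illustrated numerically. In this sketch, a is assumed to be the discriminator's pre-sigmoid score for a generated sample, so D(G(z)) = sigmoid(a). The comparison with the -log D(G(z)) loss is not from the slides; it is the alternative suggested in the original GAN paper, shown here for contrast.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def grad_saturating(a):
    # d/da of log(1 - sigmoid(a)), the generator's minimax loss.
    return -sigmoid(a)

def grad_non_saturating(a):
    # d/da of -log(sigmoid(a)), the commonly used alternative.
    return -(1.0 - sigmoid(a))

# More negative a = discriminator more confident that the sample is fake.
for a in (0.0, -5.0, -10.0):
    print(a, sigmoid(a), grad_saturating(a), grad_non_saturating(a))
```

As the discriminator grows more confident, the first gradient shrinks toward zero while the second stays close to -1.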
Types Of GAN
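1. DCGAN (Deep Convolutional GAN)
• In a simple GAN, the generator and discriminator are simple
fully connected networks, for example:
    from keras.models import Sequential
    from keras.layers import Dense, LeakyReLU, Activation

    # Fully connected generator: 100-d noise -> 784 values (e.g. a 28x28 image)
    generator = Sequential([
        Dense(128, input_shape=(100,)),
        LeakyReLU(alpha=0.01),
        Dense(784),
        Activation('tanh'),  # outputs in [-1, 1]
    ], name='generator')
• In a DCGAN, however, the Discriminator is a Convolutional Neural
Network (CNN) and the Generator is a transposed convolutional
network (deconvolutional network).
• i.e. a DCGAN is better suited to image/video data than a
simple GAN.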
Similarities of Neural Networks and CNNs
• Both fully connected neural networks and CNNs have learnable
weights and biases.
• In both networks a neuron receives some input, performs a dot
product and follows it with a non-linear function such as
ReLU (Rectified Linear Unit).
Main problems with fully connected layers
• The number of weights needed for the neural network is large.
• Networks with a large number of parameters face several
problems:
  • slower training time
  • higher chance of overfitting
  • etc.
Convolutional Neural Network (CNN)
• In a CNN the main image matrix is reduced to a matrix of lower
dimension in the first layer through an operation called
convolution.
e.g. an image of 64x64x3 can be reduced to 1x1x10 by the
subsequent operations.
Figure: Architecture of Convolutional Neural Network
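How convolution shrinks a matrix can be seen in a small pure-Python sketch. It assumes stride 1 and no padding; the 4x4 image and the 2x2 filter are made-up values for illustration only.

```python
def conv2d_valid(image, kernel):
    # "Valid" 2-D convolution as used in CNNs (technically cross-correlation):
    # the kernel slides over the image with stride 1 and no padding.
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [1, 0, 1, 3]]
kernel = [[1, 0],
          [0, -1]]  # hypothetical 2x2 filter
feature_map = conv2d_valid(image, kernel)
print(len(feature_map), "x", len(feature_map[0]))  # 3 x 3
```

A 4x4 input and a 2x2 filter give a 3x3 feature map; repeating such layers (with pooling or larger strides) is what reduces an image to a small output.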
Convolutional Layer
Max pooling
Figure: Max Pooling
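Max pooling can be sketched in a few lines. The 2x2 window with stride 2 is an assumed (though common) setting, and the input values are made up.

```python
def max_pool2d(feature_map, size=2, stride=2):
    # Keep the largest value in each size x size window.
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, w - size + 1, stride)]
            for r in range(0, h - size + 1, stride)]

fm = [[1, 3, 2, 4],
      [5, 6, 1, 2],
      [7, 2, 9, 0],
      [1, 8, 3, 4]]
pooled = max_pool2d(fm)
print(pooled)  # [[6, 4], [8, 9]]
```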
Figure: Discriminator
Figure: Generator
2. CGAN (Conditional GAN)
• When the data set is complex or large-scale, it is difficult to
control what a GAN generates.
• Conditional GANs (CGANs) are an extension of the GAN
model.
• In a CGAN the Generator and Discriminator both receive some
additional conditioning input (y). This could be
the class of the current image or some other property.
NOTE: CGANs have one disadvantage: they are not strictly
unsupervised, so some kind of labels are needed for them to work.
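One common way to supply the condition y is to append a one-hot class label to the network's input vector. The sketch below assumes 10 classes and a 100-dimensional noise vector; the helper names are hypothetical, and other conditioning schemes exist.

```python
import random

def one_hot(label, num_classes):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def conditioned_input(vector, label, num_classes):
    # Append the one-hot condition y to the input vector.
    return list(vector) + one_hot(label, num_classes)

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(100)]  # generator noise
g_in = conditioned_input(noise, label=3, num_classes=10)
print(len(g_in))  # 110
```

The Discriminator's input (for example a flattened image) can be extended with the same label in the same way, so that it judges whether an image is real for that class.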
3. CycleGAN
• The CycleGAN is an extension of the GAN architecture that
involves the simultaneous training of two generator models
and two discriminator models.
• It is a technique for automatically training image-to-image
translation models without paired examples.
• The models are trained in an unsupervised manner using
collections of images from the source and target domains that
do not need to be related in any way.
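The slides do not state how the two generators are tied together; in the CycleGAN paper this is done with a cycle-consistency loss, sketched below. G and F here are trivial stand-ins for trained generators, and the "images" are short lists of made-up pixel values.

```python
def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(x, y, G, F):
    # x is from the source domain, y from the target domain.
    # G maps source -> target, F maps target -> source.
    forward = l1_distance(F(G(x)), x)   # x -> G(x) -> F(G(x)) should return to x
    backward = l1_distance(G(F(y)), y)  # y -> F(y) -> G(F(y)) should return to y
    return forward + backward

# Hypothetical stand-ins for trained generators: brighten / darken "pixels".
G = lambda img: [p + 0.2 for p in img]
F = lambda img: [p - 0.2 for p in img]
x = [0.1, 0.5, 0.9]
y = [0.3, 0.7, 1.1]
print(cycle_consistency_loss(x, y, G, F))
```

Because F undoes G in this toy case, the loss is essentially zero; this is what lets the models learn without paired examples.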
4. SeqGAN (Sequence GAN)
• For sequential data (text, speech, etc.), there are some
limitations in applying the GAN concepts unchanged.
These limitations arise mainly from the sequential and
discrete nature of the data.
• Consider the image representation of a random matrix (M) and
the image representation of M + 0.08: a small change to every
value still gives a valid image.
• For text this does not hold. Suppose that the word "computer" is
represented by the real-valued vector v = [0.11143, -0.97712,
0.445216, ..., 0.7221240]. Then v + 0.08 is another vector,
which need not represent any word in the
vocabulary.
• e.g. "penguin" + 0.001 ==> "ostrich"
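The point can be made concrete with a toy vocabulary. The three-dimensional vectors below are invented for illustration (real word embeddings are much larger), and nearest_word is a hypothetical helper.

```python
# Hypothetical 3-dimensional word vectors; real embeddings are much larger.
vocab = {
    "computer": [0.11, -0.98, 0.45],
    "penguin":  [0.80, 0.10, -0.30],
    "ostrich":  [0.75, 0.20, -0.35],
}

def perturb(vec, delta):
    return [v + delta for v in vec]

def nearest_word(vec):
    # Snap a continuous vector to the closest vocabulary entry (squared distance).
    return min(vocab, key=lambda w: sum((a - b) ** 2 for a, b in zip(vocab[w], vec)))

v = perturb(vocab["computer"], 0.08)
print(v in vocab.values())  # False: the perturbed vector is not itself a word
print(nearest_word(v))      # mapping back to a word is a discrete lookup
```

The lookup in nearest_word is a discrete choice, so the discriminator's gradient cannot flow through it to the generator as it does through pixel values.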
• To overcome this, Goodfellow (the originator of GANs) recommended
using reinforcement learning to train GANs to generate discrete
tokens.
• SeqGAN (Sequence Generative Adversarial Nets) uses
reinforcement learning to combat the non-differentiability
issue in text GANs.
• The generator is treated as an RL agent.
• The previous tokens are the state (stored in the hidden states)
and the action is the next token to generate.
• The discriminator is fed both real and synthetic data so that it
learns to tell the difference.
• To evaluate a partial sequence, another generator is used to
complete it so that the discriminator can score it.
• Once the sentence is complete (i.e. all actions have been taken),
the generator receives a reward, in this case from the
discriminator, indicating how good the sentence is.
• The right action in a particular state is picked using a
policy.
• Policy gradient methods are used to optimize the policy.
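The policy-gradient update can be sketched with the REINFORCE rule on a toy problem. Everything here is a simplification for illustration: the state is reduced to the position in the sentence (SeqGAN's generator conditions on the previous tokens), toy_reward stands in for the discriminator's score, and the vocabulary, learning rate and step count are arbitrary.

```python
import math
import random

VOCAB = ["the", "cat", "sat", "<eos>"]  # hypothetical toy vocabulary

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_reward(tokens):
    # Stand-in for the discriminator's score of a finished sequence:
    # the fraction of positions matching a "real" sentence.
    target = ["the", "cat", "sat"]
    return sum(t == g for t, g in zip(tokens, target)) / len(target)

def train_policy(steps=500, lr=0.1, seed=0):
    rng = random.Random(seed)
    # One row of logits per position: the state is just the position here.
    logits = [[0.0] * len(VOCAB) for _ in range(3)]
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        actions = [rng.choices(range(len(VOCAB)), weights=p)[0] for p in probs]
        reward = toy_reward([VOCAB[a] for a in actions])  # given only at the end
        for pos, a in enumerate(actions):
            for k in range(len(VOCAB)):
                # REINFORCE: reward * d log pi(a | state) / d logit_k
                grad = (1.0 if k == a else 0.0) - probs[pos][k]
                logits[pos][k] += lr * reward * grad
    return logits

logits = train_policy()
for row in logits:
    print([round(p, 3) for p in softmax(row)])
```

The reward arrives only after the whole sequence has been sampled, so the same scalar is used to weight the update for every action in that sequence.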
Applications
Different types of GANs and their applications:
Conclusion
GANs are among the newer state-of-the-art neural network
approaches, with uses that include image, audio and video
generation, text-to-image synthesis and image translation. There
is a lot of active research in the field to apply GANs to language
tasks, to improve their stability and ease of training, and so on.
They are already being applied in industry for a variety of
applications, ranging from interactive image editing, 3D shape
estimation, drug discovery and semi-supervised learning to robotics.
References
THANK YOU