
Understanding GAN Loss Functions

https://neptune.ai/blog/gan-loss-functions

Ian Goodfellow introduced Generative Adversarial Networks (GANs) in 2014. It was one of the most beautiful, yet straightforward implementations of Neural Networks, and it involved two Neural Networks competing against each other. Yann LeCun, the founding father of Convolutional Neural Networks (CNNs), described GANs as "the most interesting idea in the last ten years in Machine Learning".

In simple words, the idea behind GANs can be summarized like this:

● Two Neural Networks are involved.
● One of the networks, the Generator, starts off with a random data distribution and tries to replicate a particular type of distribution.
● The other network, the Discriminator, through subsequent training, gets better at classifying a forged distribution from a real one.
● Both of these networks play a min-max game where one is trying to outsmart the other.

But when you actually try to implement them, they often don't learn the way you expect them to. One common reason is an overly simplistic loss function.

Studying GANs and the different variations of their loss functions gives us better insight into how a GAN works, and helps in addressing unexpected performance issues.

In this article, we will cover: GAN challenges, loss functions, cross entropy, minimax loss, and Wasserstein loss.

Standard GAN loss function (min-max GAN loss)

The standard GAN loss function, also known as the min-max loss, was first described in a 2014 paper by Ian Goodfellow et al., titled "Generative Adversarial Networks".
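
A standard way to write this objective (reconstructed here to match the paper's formulation, since the original equation image did not survive) is:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)]
  + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
```

where D(x) is the discriminator's probability that x is real, and G(z) is the generator's output for noise z.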

The generator tries to minimize this function while the discriminator tries
to maximize it. Looking at it as a min-max game, this formulation of the
loss seemed effective.

In practice, it saturates for the generator, meaning that the generator quite frequently stops training if it doesn't catch up with the discriminator.

The Standard GAN loss function can further be categorized into two
parts: Discriminator loss and Generator loss.

Discriminator loss

While the discriminator is trained, it classifies both the real data and the
fake data from the generator.

It penalizes itself for misclassifying a real instance as fake, or a fake instance (created by the generator) as real, by maximizing the function below.
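
In other words, the discriminator's objective is the maximization half of the min-max game (written out here, as the original equation image did not survive):

```latex
\max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)]
  + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
```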

● log(D(x)) refers to the probability that the discriminator correctly classifies the real image,
● maximizing log(1-D(G(z))) would help it to correctly label the fake image that comes from the generator.
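
As a minimal NumPy sketch (the function name is illustrative, and it assumes the discriminator's outputs are already sigmoid probabilities in (0, 1)), the discriminator's objective can be computed as:

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """Standard GAN discriminator objective (to be maximized):
    mean(log D(x)) + mean(log(1 - D(G(z)))).

    d_real: discriminator outputs on real samples, each in (0, 1)
    d_fake: discriminator outputs on generated samples, each in (0, 1)
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    # Reward high D(x) on real data and low D(G(z)) on fakes.
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))
```

A perfect discriminator (d_real → 1, d_fake → 0) drives this objective toward its maximum of 0; a coin-flipping one (both at 0.5) sits at 2·log(0.5).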

Generator loss
While the generator is trained, it samples random noise and produces an
output from that noise. The output then goes through the discriminator
and gets classified as either “Real” or “Fake” based on the ability of the
discriminator to tell one from the other.

The generator loss is then calculated from the discriminator's classification – it gets rewarded if it successfully fools the discriminator, and gets penalized otherwise.

The following equation is minimized to train the generator:
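
This is the generator's half of the min-max objective (written out here, as the original equation image did not survive):

```latex
\min_G \; \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
```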

Non-Saturating GAN Loss

A subtle variation of the standard loss function is used where the generator maximizes the log of the discriminator probabilities – log(D(G(z))).

This change is inspired by framing the problem from a different perspective, where the generator seeks to maximize the probability of images being real, instead of minimizing the probability of an image being fake.

This avoids generator saturation through a more stable weight update mechanism. In his blog, Daniel Takeshi compares the Non-Saturating GAN Loss along with some other variations.
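
The difference between the two generator losses can be sketched in NumPy (function names are illustrative; both take the discriminator's probability outputs on generated samples):

```python
import numpy as np

def generator_loss_saturating(d_fake):
    # Original min-max generator loss: minimize mean(log(1 - D(G(z)))).
    d = np.asarray(d_fake, dtype=float)
    return np.mean(np.log(1.0 - d))

def generator_loss_nonsaturating(d_fake):
    # Non-saturating variant: minimize -mean(log D(G(z))),
    # i.e. maximize the log-probability of fakes being judged real.
    d = np.asarray(d_fake, dtype=float)
    return -np.mean(np.log(d))
```

When the discriminator confidently rejects fakes (D(G(z)) ≈ 0), the saturating loss has gradient ≈ -1/(1-d) ≈ -1 with respect to d, nearly flat, while the non-saturating loss has gradient -1/d, which is large – so the generator keeps receiving a useful training signal instead of stalling.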

Challenges with GAN loss functions

More often than not, GANs tend to show some inconsistencies in performance.

Most of these problems are associated with their training and are an
active area of research.
Let’s look at some of them in detail:

Mode Collapse

This issue is on the unpredictable side of things. It wasn't foreseen until someone noticed that the generator model could only produce one or a small subset of different outcomes, or modes.

Usually, we would want our GAN to produce a range of outputs. We would expect, for example, a different face for every random input to the face generator that we design.

Instead, through subsequent training, the network learns to model a particular distribution of data, which gives us a monotonous output.

In the process of training, the generator is always trying to find the one
output that seems most plausible to the discriminator.

Because of that, the discriminator's best strategy is always to reject the output of the generator.

But if the next generation of the discriminator gets stuck in a local minimum and doesn't find its way out by optimizing its weights further, it becomes easy for the next generator iteration to find the most plausible output for the current discriminator. This way, the generator keeps repeating the same output and refrains from any further learning.


Vanishing Gradients

This phenomenon happens when the discriminator performs significantly better than the generator: either the updates to the generator become inaccurate, or they disappear altogether.

One of the proposed reasons for this is that the generator gets heavily
penalized, which leads to saturation in the value post-activation function,
and the eventual gradient vanishing.

Convergence

Since two networks are being trained at the same time, the problem of GAN convergence was one of the earliest, and quite possibly one of the most challenging, problems since the technique's inception.

The utopian situation where both networks stabilize and produce a consistent result is hard to achieve in most cases. One explanation for this problem is that as the generator gets better over subsequent epochs, the discriminator performs worse, because it can no longer easily tell the difference between a real and a fake sample.

If the generator succeeds all the time, the discriminator has a 50%
accuracy, similar to that of flipping a coin. This poses a threat to the
convergence of the GAN as a whole.

As the discriminator’s feedback loses its meaning over subsequent
epochs by giving outputs with equal probability, the generator may
deteriorate its own quality if it continues to train on these junk training
signals.

This Medium article by Jonathan Hui takes a comprehensive look at all the aforementioned problems from a mathematical perspective.

Alternate GAN loss functions

Several different variations of the original GAN loss have been proposed since its inception. To a certain extent, they address the challenges we discussed earlier.

We will discuss some of the most popular ones, which either alleviate these issues or are employed for a specific problem statement:

Wasserstein Generative Adversarial Network (WGAN)

This is one of the most powerful alternatives to the original GAN loss. It
tackles the problem of Mode Collapse and Vanishing Gradient.

In this implementation, the activation of the output layer of the discriminator is changed from a sigmoid to a linear one. This simple change means the discriminator gives out a score instead of a probability associated with the data distribution, so the output does not have to be in the range of 0 to 1.

Here, the discriminator is called a critic instead, because it doesn't actually classify the data strictly as real or fake; it simply gives each sample a rating.

The following loss functions are used to train the critic and the generator, respectively.
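
A common way to write these objectives (reconstructed here to match the WGAN paper's formulation, with the critic denoted D, since the original equation images did not survive) is:

```latex
\text{Critic:}\quad \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}[D(x)]
  - \mathbb{E}_{z \sim p_z}[D(G(z))]
\qquad
\text{Generator:}\quad \min_G \;
  -\,\mathbb{E}_{z \sim p_z}[D(G(z))]
```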

The outputs of the critic and the generator are not expressed in probabilistic terms (between 0 and 1), so while training the critic network, the difference between the critic's mean score on real data and its mean score on generated data is maximized. Similarly, while training the generator network, the critic's mean score on the generated data is maximized.

The original paper used RMSprop, followed by weight clipping to prevent the weight values from exploding:
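
A minimal NumPy sketch of the critic loss and the clipping step (function names are illustrative; the clipping bound c = 0.01 is the default used in the WGAN paper):

```python
import numpy as np

def critic_loss(c_real, c_fake):
    # WGAN critic loss (to minimize): the negative of
    # (mean critic score on real data - mean score on fakes).
    return -(np.mean(c_real) - np.mean(c_fake))

def clip_weights(weights, c=0.01):
    # After each critic update, clamp every parameter into [-c, c].
    # This is the (crude) enforcement of the Lipschitz constraint
    # required by the Wasserstein formulation.
    return [np.clip(w, -c, c) for w in weights]
```

In a training loop, clip_weights would be applied to the critic's parameter arrays immediately after each RMSprop update.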

Conditional Generative Adversarial Network (CGAN)

This version of GAN is used to learn a multimodal model. It can, for example, generate descriptive labels – attributes associated with a particular image – that were not part of the original training data.

CGANs are mainly employed in image labelling, where both the generator and the discriminator are fed with some extra information y that works as auxiliary information, such as class labels or data from different modalities.

The conditioning is usually done by feeding the information y into both the discriminator and the generator, as an additional input layer.

The following modified loss function plays the same min-max game as in
the Standard GAN Loss function. The only difference between them is
that a conditional probability is used for both the generator and the
discriminator, instead of the regular one.
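
The conditional objective is commonly written as (reconstructed here to match the CGAN paper's notation, with y the conditioning information, since the original equation image did not survive):

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x \mid y)]
  + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z \mid y)))]
```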

Why conditional probability? Because we are feeding in some auxiliary information, which helps in making it a multimodal model.

Figure 1: Conditional adversarial net
This Medium article by Jonathan Hui delves deeper into CGANs and discusses the mathematics behind them.

Summary

In this blog, we discussed:

● The original Generative Adversarial Network loss functions, along with the modified ones.
● Different challenges of employing them in real-life scenarios.
● Alternative loss functions like WGAN and CGAN.
