2017, ArXiv
Generative Adversarial Networks (GANs) are a powerful framework for deep generative modeling. Posed as a two-player minimax problem, GANs are typically trained end-to-end on real-valued data and can be used to train a generator of high-dimensional and realistic images. However, a major limitation of GANs is that training relies on passing gradients from the discriminator through the generator via back-propagation. This makes it fundamentally difficult to train GANs with discrete data, as generation in this case typically involves a non-differentiable function. These difficulties extend to the reinforcement learning setting when the action space is composed of discrete decisions. We address these issues by reframing the GAN framework so that the generator is no longer trained using gradients through the discriminator, but is instead trained using a learned critic in the actor-critic framework with a Temporal Difference (TD) objective. This is a natural fit for sequence modeling and w...
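As a rough illustration of the mechanism described in this abstract, the sketch below (PyTorch) trains a token-level generator with a policy-gradient actor and a TD-trained critic, using the discriminator only as a terminal scalar reward rather than as a source of back-propagated gradients. The network sizes, the one-step TD target, and the use of log D(sequence) as the reward are illustrative assumptions, not the paper's exact formulation; discriminator training is omitted.

# Minimal sketch, assuming: one-step TD target, reward = log D(sequence) given only
# at the final step, tiny single-layer networks. Sampling the discrete token breaks
# back-propagation from the discriminator, so the generator is instead updated with
# a policy gradient weighted by the critic's TD advantage.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, hidden, T = 50, 64, 20
actor  = nn.GRU(vocab, hidden, batch_first=True)   # generator (policy) body
head   = nn.Linear(hidden, vocab)                  # token logits
critic = nn.Linear(hidden, 1)                      # state-value estimate
disc   = nn.GRU(vocab, hidden, batch_first=True)   # discriminator body (held fixed here)
d_head = nn.Linear(hidden, 1)

opt = torch.optim.Adam(
    list(actor.parameters()) + list(head.parameters()) + list(critic.parameters()),
    lr=1e-3)

x, h = torch.zeros(1, 1, vocab), None              # start token (one-hot)
log_probs, values, tokens = [], [], []
for t in range(T):
    out, h = actor(x, h)
    dist = torch.distributions.Categorical(logits=head(out[:, -1]))
    a = dist.sample()                              # discrete sample: no gradient flows here
    log_probs.append(dist.log_prob(a))
    values.append(critic(out[:, -1]).squeeze(-1))
    tokens.append(a)
    x = F.one_hot(a, vocab).float().unsqueeze(1)

seq = F.one_hot(torch.stack(tokens, dim=1), vocab).float()
with torch.no_grad():                              # discriminator scores the full sequence
    d_out, _ = disc(seq)
    reward = torch.sigmoid(d_head(d_out[:, -1])).log().squeeze(-1)

loss = 0.0
for t in reversed(range(T)):                       # one-step TD: bootstrap from next value
    target = reward if t == T - 1 else values[t + 1].detach()
    adv = target - values[t]
    loss = loss + adv.detach() * (-log_probs[t]) + adv.pow(2)   # actor + critic (TD) terms

opt.zero_grad(); loss.mean().backward(); opt.step()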
2019
This paper presents a novel approach to train GANs for discrete sequence generation without resorting to an explicit neural network as the discriminator. We show that when an alternative minimax optimization procedure is performed for the value function, in which a closed-form solution for the discriminator exists in the maximization step, it is equivalent to directly optimizing the Jensen-Shannon divergence (JSD) between the generator's distribution and the empirical distribution over the training data without sampling from the generator; optimizing the JSD thus becomes computationally tractable for training a generator that produces sequences of discrete data. Extensive experiments on synthetic data and real-world tasks demonstrate significant improvements over existing methods to train GANs that generate discrete sequences.
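For reference, the closed-form result this construction builds on is the standard optimal-discriminator analysis from Goodfellow et al. (2014): for a fixed generator G, the maximizing discriminator and the resulting value of the inner game are

D^*(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_G(x)}, \qquad \max_D V(G, D) = -\log 4 + 2\,\mathrm{JSD}\big(p_{\mathrm{data}} \,\|\, p_G\big),

where \mathrm{JSD}(p \,\|\, q) = \tfrac{1}{2}\mathrm{KL}\big(p \,\|\, \tfrac{p+q}{2}\big) + \tfrac{1}{2}\mathrm{KL}\big(q \,\|\, \tfrac{p+q}{2}\big). How the paper evaluates this quantity without sampling from the generator is specific to its construction and is not reproduced here.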
Lecture Notes in Computer Science, 2019
Predicting the future sounds like fantasy, yet it is practical work. It is a key component of intelligent agents such as self-driving vehicles, medical monitoring devices, and robots. In this work, we consider generating unseen future frames from previous observations, which is notoriously hard due to the uncertainty in frame dynamics. While recent works based on generative adversarial networks (GANs) have made remarkable progress, there remains an obstacle to making accurate and realistic predictions. In this paper, we propose a novel GAN based on inter-frame differences to circumvent these difficulties. More specifically, our model is a multi-stage generative network named the Difference Guided Generative Adversarial Network (DGGAN). The DGGAN learns to explicitly enforce future-frame predictions guided by a synthetic inter-frame difference. Given a sequence of frames, DGGAN first uses dual paths to generate meta information. One path, called the Coarse Frame Generator, predicts the coarse details of future frames, and the other path, called the Difference Guide Generator, generates a difference image that includes complementary fine details. The coarse details are then refined via the guidance of the difference image under the support of GANs. With this model and novel architecture, we achieve state-of-the-art performance for future video prediction on UCF-101 and KITTI.
2021
Generative Adversarial Networks, a promising research direction in the AI community, have recently attracted considerable attention due to their ability to generate high-quality realistic data. GANs are a competitive game between two neural networks trained in an adversarial manner to reach a Nash equilibrium. Despite the improvements accomplished in GANs in recent years, several issues remain to be solved, and how to tackle them and make advances has led to rising research interest. This paper reviews literature that leverages game theory in GANs and addresses how game models can relieve specific challenges of generative models and improve the GAN's performance. In particular, we first review some preliminaries, including the basic GAN model and some game theory background. After that, we present our taxonomy to summarize the state-of-the-art solutions into three significant categories: modified game model, modified architecture, and modified learning meth...
International Journal of Advanced Computer Science and Applications
Dialogue management systems are commonly applied in daily life, for example in online shopping, hotel booking, and driving booking. An efficient dialogue management policy helps systems respond to the user effectively. Policy learning is a complex task in building a dialogue system. Different approaches have been proposed in the last decade to build goal-oriented dialogue agents and to train systems with an efficient policy. The generative adversarial network (GAN) has been used for dialogue generation in previous works to build dialogue agents by selecting the optimal policy learning. Efficient dialogue policy learning aims to improve the fluency and diversity of generated dialogues. Reinforcement learning (RL) algorithms are used to optimize the policies because the sequences are discrete. In this study, we propose a new technique called the Cascade Generative Adversarial Network (Cas-GAN), a combination of GAN and RL for dialogue generation. Cas-GAN can model the relations between dialogues (sentences) by using Graph Convolutional Networks (GCN). The graph consists of different high-level and low-level nodes representing the vertices and edges of the graph. We then use the maximum log-likelihood (MLL) approach to train the parameters and choose the best nodes. The experimental results are compared with HRL and RL agents, and we obtain state-of-the-art results.
Generative Adversarial Networks (GANs) have recently attracted considerable attention in the AI community due to their ability to generate high-quality data of significant statistical resemblance to real data. Fundamentally, a GAN is a game between two neural networks trained in an adversarial manner to reach a zero-sum Nash equilibrium profile. Despite the improvement accomplished in GANs in the last few years, several issues remain to be solved. This paper reviews the literature on the game-theoretic aspects of GANs and addresses how game theory models can address specific challenges of generative models and improve the GAN's performance. We first present some preliminaries, including the basic GAN model and some game theory background. We then present a taxonomy to classify state-of-the-art solutions into three main categories: modified game models, modified architectures, and modified learning methods. The classification is based on modifications made to the basic GAN model by p...
arXiv (Cornell University), 2022
In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, known as posterior collapse. To mitigate this, state-of-the-art models 'weaken' the 'powerful' decoder by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is compensated for by utilizing the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence modeling performance and the information captured in the latent space.
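As context for the baseline remedy mentioned above, here is a minimal sketch (PyTorch) of uniformly random word dropout applied to the decoder's input tokens; the <unk> replacement index, the dropout rate, and the padding handling are illustrative assumptions rather than details taken from the paper, and the proposed information-based adversarial variant is not shown.

# Uniform word dropout on decoder inputs: random tokens are replaced with <unk>
# so the autoregressive decoder cannot rely solely on previous tokens and must
# use the latent code.
import torch

def word_dropout(decoder_inputs, rate=0.4, unk_idx=1, pad_idx=0):
    """decoder_inputs: LongTensor of shape (batch, seq_len)."""
    keep = torch.rand(decoder_inputs.shape, device=decoder_inputs.device) >= rate
    keep |= decoder_inputs.eq(pad_idx)                      # never corrupt padding
    return torch.where(keep, decoder_inputs, torch.full_like(decoder_inputs, unk_idx))

tokens = torch.tensor([[5, 9, 3, 7, 0, 0]])
print(word_dropout(tokens))                                 # some tokens replaced by index 1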
IEEE, 2024
Generative adversarial networks (GANs) are a cutting-edge approach to generative modeling in deep learning. GANs were proposed in 2014 by Ian Goodfellow. Since then, there has been significant growth in adversarial networks. New breakthroughs and innovative approaches in generative adversarial networks can radically elevate the quality of synthetically generated images by extracting patterns from the original datasets. Among the major advancements of GANs, image synthesis is the most prominent and extensively studied application. The concept of adversarial training, where two neural networks compete against each other, has introduced a novel paradigm for learning complex patterns in images. The paper emphasizes the vital role of GANs in strengthening and fine-tuning datasets, directing further research toward GANs capable of producing high-quality synthetic samples from limited amounts of data.
2021
Generative Modelling has been a very extensive area of research since it finds immense use cases across multiple domains. Various models have been proposed in the recent past, including Fully Visible Belief Nets, NADE, MADE, Pixel RNN, Variational Auto Encoders, Markov Chains, and Generative Adversarial Networks. Amongst all these models, Generative Adversarial Networks have consistently shown huge potential and developments in the areas of Art, Music, Semi-Supervised Learning, Handling Missing Data, Drug Discovery, and Unsupervised Learning. This emerging technology has reshaped the research landscape in the field of generative modeling. Research on Generative Adversarial Networks (GANs) was introduced by Ian J. Goodfellow et al. in 2014 [1]. Since its inception, various models have been proposed over the years and are considered state-of-the-art models in generative modeling. In this survey, we provide a comprehensive review of the original GAN model and it...
2020
Generative Adversarial Networks (GANs) are part of the deep generative model family and are able to generate synthetic samples based on the underlying distribution of real-world data. With expanding interest, new discoveries and recent advances are hard to follow. Recent advancements to stabilize training will help GANs open up new domains using adjusted architectures and loss functions. Various findings show that GANs can be used to generate not only images but also text and audio. This paper presents an overview of different GAN architectures, giving summaries of the underlying fundamentals of each presented GAN. Furthermore, this paper looks into four application domains and lists additional domains. Additionally, it summarizes the datasets and metrics used to evaluate GANs and presents recent scientific advancements. Keywords–generative adversarial networks; machine learning; deep learning.
2021
We present an alternative perspective on the training of generative adversarial networks (GANs), showing that the training step for a GAN generator decomposes into two implicit sub-problems. In the first, the discriminator provides new target data to the generator in the form of "inverse examples" produced by approximately inverting classifier labels. In the second, these examples are used as targets to update the generator via least-squares regression, regardless of the main loss specified to train the network. We experimentally validate our main theoretical result and discuss implications for alternative training methods that are made possible by making these sub-problems explicit. We also introduce a simple representation of inductive bias in networks, which we apply to describing the generator's output relative to its regression targets.
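A hedged sketch of the two implicit sub-problems described above, under the assumption that "inverse examples" are produced by a few gradient-ascent steps that push generator outputs toward higher discriminator scores; the step size, network shapes, and number of inversion steps are placeholders, and discriminator training is omitted.

# Sub-problem 1: build regression targets by approximately inverting the label.
# Sub-problem 2: update the generator by least-squares regression onto those targets.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)

z = torch.randn(64, 8)
fake = G(z)

# Sub-problem 1: push the generated samples toward "real" under D to get targets.
target = fake.detach().clone().requires_grad_(True)
for _ in range(5):
    score = D(target).sum()
    grad, = torch.autograd.grad(score, target)
    target = (target + 0.1 * grad).detach().requires_grad_(True)

# Sub-problem 2: least-squares regression of the generator onto the targets.
loss = (fake - target.detach()).pow(2).mean()
opt_G.zero_grad(); loss.backward(); opt_G.step()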
ArXiv, 2017
We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL). Current log-likelihood training methods are limited by the discrepancy between their training and testing modes, as models must generate tokens conditioned on their previous guesses rather than the ground-truth tokens. We address this problem by introducing a critic network that is trained to predict the value of an output token, given the policy of an actor network. This results in a training procedure that is much closer to the test phase, and allows us to directly optimize for a task-specific score such as BLEU. Crucially, since we leverage these techniques in the supervised learning setting rather than the traditional RL setting, we condition the critic network on the ground-truth output. We show that our method leads to improved performance on both a synthetic task and German-English machine translation. Our analysis paves ...
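The sketch below (PyTorch) illustrates the general actor-critic pattern for sequence generation described here: a critic conditioned on the ground-truth sequence predicts per-token values, and the actor is updated toward tokens with high predicted value. The architecture and dimensions are assumptions for illustration; training the critic itself (regressing toward task scores such as BLEU) is omitted.

# Actor-critic for sequence generation: critic sees both the actor's samples and
# the ground-truth reference; actor maximizes the expected predicted token value.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, hid, T = 100, 64, 10
actor_rnn,  actor_out  = nn.GRU(vocab, hid, batch_first=True), nn.Linear(hid, vocab)
critic_rnn, critic_out = nn.GRU(2 * vocab, hid, batch_first=True), nn.Linear(hid, vocab)

sampled = torch.randint(vocab, (1, T))              # tokens sampled from the actor
truth   = torch.randint(vocab, (1, T))              # ground-truth reference

x = F.one_hot(sampled, vocab).float()
h, _ = actor_rnn(x)
logits = actor_out(h)                               # actor's per-position token distributions

c_in = torch.cat([x, F.one_hot(truth, vocab).float()], dim=-1)
q, _ = critic_rnn(c_in)                             # critic conditioned on the ground truth
q_values = critic_out(q)                            # predicted value of every candidate token

probs = logits.softmax(-1)
actor_loss = -(probs * q_values.detach()).sum(-1).mean()   # expected value under the policy
actor_loss.backward()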
2018
In this work we present a new agent architecture, called Reactor, which combines multiple algorithmic and architectural contributions to produce an agent with higher sample-efficiency than Prioritized Dueling DQN (Wang et al., 2016) and Categorical DQN (Bellemare et al., 2017), while giving better run-time performance than A3C (Mnih et al., 2016). Our first contribution is a new policy evaluation algorithm called Distributional Retrace, which brings multi-step off-policy updates to the distributional reinforcement learning setting. The same approach can be used to convert several classes of multi-step policy evaluation algorithms designed for expected value evaluation into distributional ones. Next, we introduce the β-leave-one-out policy gradient algorithm which improves the trade-off between variance and bias by using action values as a baseline. Our final algorithmic contribution is a new prioritized replay algorithm for sequences, which exploits the temporal locality of ne...
arXiv (Cornell University), 2023
Actor-critic (AC) methods are widely used in reinforcement learning (RL) and benefit from the flexibility of using any policy gradient method as the actor and any value-based method as the critic. The critic is usually trained by minimizing the TD error, an objective that is potentially decorrelated from the true goal of achieving a high reward with the actor. We address this mismatch by designing a joint objective for training the actor and critic in a decision-aware fashion. We use the proposed objective to design a generic AC algorithm that can easily handle any function approximation. We explicitly characterize the conditions under which the resulting algorithm guarantees monotonic policy improvement, regardless of the choice of the policy and critic parameterization. Instantiating the generic algorithm results in an actor that involves maximizing a sequence of surrogate functions (similar to TRPO, PPO) and a critic that involves minimizing a closely connected objective. Using simple bandit examples, we provably establish the benefit of the proposed critic objective over the standard squared error. Finally, we empirically demonstrate the benefit of our decision-aware actor-critic framework on simple RL problems.

1 Introduction. Reinforcement learning (RL) is a framework for solving problems involving sequential decision-making under uncertainty, and has found applications in games [38, 50], robot manipulation tasks [55, 64], and clinical trials [45]. RL algorithms aim to learn a policy that maximizes the long-term return by interacting with the environment. Policy gradient (PG) methods [59, 54, 29, 25, 47] are an important class of algorithms that can easily handle function approximation and structured state-action spaces, making them widely used in practice. PG methods assume a differentiable parameterization of the policy and directly optimize the return with respect to the policy parameters. Typically, a policy's return is estimated by using Monte-Carlo samples obtained via environment interactions [59]. Since the environment is stochastic, this approach results in high variance in the estimated return, leading to higher sample complexity (the number of environment interactions required to learn a good policy). Actor-critic (AC) methods [29, 43, 5] alleviate this issue by using value-based approaches [52, 58] in conjunction with PG methods, and have been empirically successful [20, 23]. In AC algorithms, a value-based method ("critic") is used to approximate a policy's estimated value, and a PG method ("actor") uses this estimate to improve the policy towards obtaining higher returns. Though AC methods have the flexibility of using any method to independently train the actor and critic, it is unclear how to train the two components jointly in order to learn good policies. For example, the critic is typically trained via temporal difference (TD) learning and its objective is to minimize the value estimation error across all states and actions. For large real-world Markov decision processes (MDPs), it is intractable to estimate the values across all states and actions.
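For context, the "standard squared error" critic objective that the paper contrasts against is the usual semi-gradient TD update, sketched below in PyTorch with assumed network sizes and a synthetic transition batch.

# Standard critic training: minimize the squared one-step TD error
# (r + gamma * V(s') - V(s))^2 with the bootstrapped target detached.
import torch
import torch.nn as nn

V = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(V.parameters(), lr=1e-3)
gamma = 0.99

s, s_next = torch.randn(32, 4), torch.randn(32, 4)   # batch of transitions (s, r, s')
r = torch.randn(32, 1)

td_target = r + gamma * V(s_next).detach()            # bootstrap; no gradient through target
critic_loss = (td_target - V(s)).pow(2).mean()

opt.zero_grad(); critic_loss.backward(); opt.step()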
2017 IEEE International Conference on Computer Vision (ICCV)
In this paper, we propose a generative model, Temporal Generative Adversarial Nets (TGAN), which can learn a semantic representation of unlabeled videos, and is capable of generating videos. Unlike existing Generative Adversarial Nets (GAN)-based methods that generate videos with a single generator consisting of 3D deconvolutional layers, our model exploits two different types of generators: a temporal generator and an image generator. The temporal generator takes a single latent variable as input and outputs a set of latent variables, each of which corresponds to an image frame in a video. The image generator transforms a set of such latent variables into a video. To deal with instability in training of GAN with such advanced networks, we adopt a recently proposed model, Wasserstein GAN, and propose a novel method to train it stably in an end-to-end manner. The experimental results demonstrate the effectiveness of our methods.
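A rough architectural sketch of the two-generator idea (dimensions, MLP layers, and frame size are assumptions; the paper's actual generators use deconvolutional networks and WGAN training, which are not shown):

# Temporal generator maps one latent to T per-frame latents; the image generator
# maps each per-frame latent (together with the original latent) to one frame.
import torch
import torch.nn as nn

T, z_dim = 16, 100

temporal_gen = nn.Sequential(                          # z0 -> [z1, ..., zT]
    nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, T * z_dim))
image_gen = nn.Sequential(                             # (z0, zt) -> 64x64 frame
    nn.Linear(2 * z_dim, 512), nn.ReLU(), nn.Linear(512, 64 * 64), nn.Tanh())

z0 = torch.randn(1, z_dim)
zt = temporal_gen(z0).view(T, z_dim)
frames = image_gen(torch.cat([z0.expand(T, -1), zt], dim=1)).view(T, 64, 64)
print(frames.shape)                                    # torch.Size([16, 64, 64])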
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
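The minimax two-player game referred to here is, in the paper's notation, the value function

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big],

with D trained to maximize it and G trained to minimize it by alternating gradient steps.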
2020 International Joint Conference on Neural Networks (IJCNN)
One of the challenging problems in sequence generation tasks is the optimized generation of sequences with specific desired goals. Current sequential generative models mainly generate sequences to closely mimic the training data, without direct optimization of desired goals or properties specific to the task. We introduce OptiGAN, a generative model that incorporates both Generative Adversarial Networks (GAN) and Reinforcement Learning (RL) to optimize desired goal scores using policy gradients. We apply our model to text and real-valued sequence generation, where our model is able to achieve higher desired scores, outperforming GAN and RL baselines while not sacrificing output sample diversity.
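As a hedged sketch of the general mechanism (not OptiGAN's exact formulation), the function below computes a policy-gradient generator loss whose reward mixes the discriminator's realism signal with a task-specific goal score; the mixing weight lam, the mean baseline, and the example inputs are hypothetical placeholders.

# Policy-gradient (REINFORCE-style) generator loss with a mixed reward.
import torch

def generator_pg_loss(log_probs, d_scores, goal_scores, lam=0.5):
    """log_probs:   (batch,) summed log-probability of each generated sequence
       d_scores:    (batch,) discriminator probability that the sequence is real
       goal_scores: (batch,) task-specific desired-property score"""
    reward = (1 - lam) * torch.log(d_scores + 1e-8) + lam * goal_scores
    baseline = reward.mean()                            # simple variance-reduction baseline
    return -((reward - baseline).detach() * log_probs).mean()

loss = generator_pg_loss(torch.randn(8, requires_grad=True), torch.rand(8), torch.rand(8))
loss.backward()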
Recently, generative adversarial networks (GANs) have become a research focus of artificial intelligence. Inspired by the two-player zero-sum game, GANs comprise a generator and a discriminator, both trained under the adversarial learning idea. The goal of GANs is to estimate the potential distribution of real data samples and generate new samples from that distribution. Since their inception, GANs have been widely studied due to their enormous prospects for applications, including image and vision computing, speech and language processing, etc. In this review paper, we summarize the state of the art of GANs and look into the future. First, we survey GANs' background, theoretical and implementation models, and application fields. Then, we discuss GANs' advantages and disadvantages, and their development trends. In particular, we investigate the relation between GANs and parallel intelligence, concluding that GANs have great potential in parallel systems research in terms of virtual-real interaction and integration. Clearly, GANs can provide substantial algorithmic support for parallel intelligence.
IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2012
Policy gradient based actor-critic algorithms are amongst the most popular algorithms in the reinforcement learning framework. Their advantage of being able to search for optimal policies using low-variance gradient estimates has made them useful in several real-life applications, such as robotics, power control and finance. Although general surveys on reinforcement learning techniques already exist, no survey is specifically dedicated to actor-critic algorithms in particular. This paper therefore describes the state of the art of actor-critic algorithms, with a focus on methods that can work in an online setting and use function approximation in order to deal with continuous state and action spaces. After starting with a discussion on the concepts of reinforcement learning and the origins of actor-critic algorithms, this paper describes the workings of the natural gradient, which has made its way into many actor-critic algorithms in the past few years. A review of several standard and natural actor-critic algorithms follows and the paper concludes with an overview of application areas and a discussion on open issues.
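The sketch below shows the textbook one-step actor-critic update of the kind this survey covers: the TD error acts as a low-variance advantage estimate for the policy (actor) gradient, and the same TD error drives the critic. Network shapes and the single synthetic transition are illustrative assumptions.

# One-step actor-critic update on a single transition (s, a, r, s').
import torch
import torch.nn as nn

actor  = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))   # action logits
critic = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 1))   # state value
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=3e-4)
gamma = 0.99

s = torch.randn(4); r = 1.0; s_next = torch.randn(4); done = False      # one transition

dist = torch.distributions.Categorical(logits=actor(s))
a = dist.sample()

td_error = r + gamma * critic(s_next).detach() * (1 - done) - critic(s)
loss = -td_error.detach() * dist.log_prob(a) + td_error.pow(2)          # actor + critic terms

opt.zero_grad(); loss.sum().backward(); opt.step()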
2020 International Joint Conference on Neural Networks (IJCNN), 2020
In this paper, we propose a novel technique for training Generative Adversarial Networks (GANs) using autoencoders. GANs, in recent years, have emerged as one of the most popular generative models. Despite their success, there are several challenges in maintaining the trade-off between diversity and quality of the generated distribution. Our idea stems from the fact that deeper layers of an autoencoder contain high-level feature representation of the input data distribution. Reusing these layers provides GAN with information about the representative characteristics of real data and hence can guide its adversarial training. We call our model Guided GAN since the autoencoder (guiding network) provides a direction to train the GAN (generative network). Guided GAN also minimizes both the forward and reverse Kullback-Leibler (KL) divergence in a single model, exploiting the complementary statistical properties of the two. We conduct extensive experiments and use various metrics for asses...
2017
We propose in this paper a new approach to train Generative Adversarial Nets (GANs) with a mixture of generators to overcome the mode collapsing problem. The main intuition is to employ multiple generators instead of a single one as in the original GAN. The idea is simple, yet proven to be extremely effective at covering diverse data modes, easily overcoming the mode collapsing problem and delivering state-of-the-art results. A minimax formulation is established among a classifier, a discriminator, and a set of generators in a similar spirit to GAN. Generators create samples that are intended to come from the same distribution as the training data, whilst the discriminator determines whether samples are true data or generated by the generators, and the classifier specifies which generator a sample comes from. The distinguishing feature is that internal samples are created from multiple generators, and then one of them will be randomly selected as final output similar...