2017, ArXiv
Generative Adversarial Networks (GANs) are a powerful framework for deep generative modeling. Posed as a two-player minimax problem, GANs are typically trained end-to-end on real-valued data and can be used to train a generator of high-dimensional and realistic images. However, a major limitation of GANs is that training relies on passing gradients from the discriminator through the generator via back-propagation. This makes it fundamentally difficult to train GANs with discrete data, as generation in this case typically involves a non-differentiable function. These difficulties extend to the reinforcement learning setting when the action space is composed of discrete decisions. We address these issues by reframing the GAN framework so that the generator is no longer trained using gradients through the discriminator, but is instead trained using a learned critic in the actor-critic framework with a Temporal Difference (TD) objective. This is a natural fit for sequence modeling and w...
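As a rough illustration of the mechanism described in this abstract, the sketch below (PyTorch) trains a token-level generator with a policy-gradient actor and a TD-trained critic, using the discriminator only as a terminal scalar reward rather than as a source of back-propagated gradients. The network sizes, the one-step TD target, and the use of log D(sequence) as the reward are illustrative assumptions, not the paper's exact formulation; discriminator training is omitted.

# Minimal sketch, assuming: one-step TD target, reward = log D(sequence) given only
# at the final step, tiny single-layer networks. Sampling the discrete token breaks
# back-propagation from the discriminator, so the generator is instead updated with
# a policy gradient weighted by the critic's TD advantage.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, hidden, T = 50, 64, 20
actor  = nn.GRU(vocab, hidden, batch_first=True)   # generator (policy) body
head   = nn.Linear(hidden, vocab)                  # token logits
critic = nn.Linear(hidden, 1)                      # state-value estimate
disc   = nn.GRU(vocab, hidden, batch_first=True)   # discriminator body (held fixed here)
d_head = nn.Linear(hidden, 1)

opt = torch.optim.Adam(
    list(actor.parameters()) + list(head.parameters()) + list(critic.parameters()),
    lr=1e-3)

x, h = torch.zeros(1, 1, vocab), None              # start token (one-hot)
log_probs, values, tokens = [], [], []
for t in range(T):
    out, h = actor(x, h)
    dist = torch.distributions.Categorical(logits=head(out[:, -1]))
    a = dist.sample()                              # discrete sample: no gradient flows here
    log_probs.append(dist.log_prob(a))
    values.append(critic(out[:, -1]).squeeze(-1))
    tokens.append(a)
    x = F.one_hot(a, vocab).float().unsqueeze(1)

seq = F.one_hot(torch.stack(tokens, dim=1), vocab).float()
with torch.no_grad():                              # discriminator scores the full sequence
    d_out, _ = disc(seq)
    reward = torch.sigmoid(d_head(d_out[:, -1])).log().squeeze(-1)

loss = 0.0
for t in reversed(range(T)):                       # one-step TD: bootstrap from next value
    target = reward if t == T - 1 else values[t + 1].detach()
    adv = target - values[t]
    loss = loss + adv.detach() * (-log_probs[t]) + adv.pow(2)   # actor + critic (TD) terms

opt.zero_grad(); loss.mean().backward(); opt.step()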
2019
This paper presents a novel approach to train GANs for discrete sequence generation without resorting to an explicit neural network as the discriminator. We show that when an alternative minimax optimization procedure is performed for the value function, in which a closed-form solution for the discriminator exists in the maximization step, it is equivalent to directly optimizing the Jensen-Shannon divergence (JSD) between the generator's distribution and the empirical distribution over the training data without sampling from the generator; optimizing the JSD thus becomes computationally tractable for training a generator that produces sequences of discrete data. Extensive experiments on synthetic data and real-world tasks demonstrate significant improvements over existing methods to train GANs that generate discrete sequences.
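For reference, the closed-form result this construction builds on is the standard optimal-discriminator analysis from Goodfellow et al. (2014): for a fixed generator G, the maximizing discriminator and the resulting value of the inner game are

D^*(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_G(x)}, \qquad \max_D V(G, D) = -\log 4 + 2\,\mathrm{JSD}\big(p_{\mathrm{data}} \,\|\, p_G\big),

where \mathrm{JSD}(p \,\|\, q) = \tfrac{1}{2}\mathrm{KL}\big(p \,\|\, \tfrac{p+q}{2}\big) + \tfrac{1}{2}\mathrm{KL}\big(q \,\|\, \tfrac{p+q}{2}\big). How the paper evaluates this quantity without sampling from the generator is specific to its construction and is not reproduced here.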
Lecture Notes in Computer Science, 2019
Predicting the future sounds like fantasy, yet it is practical work. It is a key component of intelligent agents such as self-driving vehicles, medical monitoring devices, and robots. In this work, we consider generating unseen future frames from previous observations, which is notoriously hard due to the uncertainty in frame dynamics. While recent works based on generative adversarial networks (GANs) have made remarkable progress, there remains an obstacle to making accurate and realistic predictions. In this paper, we propose a novel GAN based on inter-frame differences to circumvent these difficulties. More specifically, our model is a multi-stage generative network named the Difference Guided Generative Adversarial Network (DGGAN). The DGGAN learns to explicitly enforce future-frame predictions guided by a synthetic inter-frame difference. Given a sequence of frames, DGGAN first uses dual paths to generate meta information. One path, called the Coarse Frame Generator, predicts the coarse details of future frames, and the other path, called the Difference Guide Generator, generates a difference image that includes complementary fine details. The coarse details are then refined via the guidance of the difference image under the support of GANs. With this model and novel architecture, we achieve state-of-the-art performance for future video prediction on UCF-101 and KITTI.
2021
Generative Adversarial Networks, a promising research direction in the AI community, have recently attracted considerable attention due to their ability to generate high-quality realistic data. GANs are a competitive game between two neural networks trained in an adversarial manner to reach a Nash equilibrium. Despite the improvements accomplished in GANs in recent years, several issues remain to be solved, and how to tackle them and make advances has led to rising research interest. This paper reviews literature that leverages game theory in GANs and addresses how game models can relieve specific challenges of generative models and improve the GAN's performance. In particular, we first review some preliminaries, including the basic GAN model and some game theory background. After that, we present our taxonomy to summarize the state-of-the-art solutions into three significant categories: modified game model, modified architecture, and modified learning meth...
International Journal of Advanced Computer Science and Applications
Dialogue management systems are commonly applied in daily life, for example in online shopping, hotel booking, and driving booking. An efficient dialogue management policy helps systems respond to the user effectively. Policy learning is a complex task in building a dialogue system. Different approaches have been proposed in the last decade to build goal-oriented dialogue agents and to train systems with an efficient policy. The generative adversarial network (GAN) has been used for dialogue generation in previous works to build dialogue agents by selecting the optimal policy learning. Efficient dialogue policy learning aims to improve the fluency and diversity of generated dialogues. Reinforcement learning (RL) algorithms are used to optimize the policies because the sequences are discrete. In this study, we propose a new technique called the Cascade Generative Adversarial Network (Cas-GAN), a combination of GAN and RL for dialogue generation. Cas-GAN can model the relations between dialogues (sentences) by using Graph Convolutional Networks (GCN). The graph consists of different high-level and low-level nodes representing the vertices and edges of the graph. We then use the maximum log-likelihood (MLL) approach to train the parameters and choose the best nodes. The experimental results are compared with HRL and RL agents, and we obtain state-of-the-art results.
Generative Adversarial Networks (GANs) have recently attracted considerable attention in the AI community due to their ability to generate high-quality data of significant statistical resemblance to real data. Fundamentally, a GAN is a game between two neural networks trained in an adversarial manner to reach a zero-sum Nash equilibrium profile. Despite the improvement accomplished in GANs in the last few years, several issues remain to be solved. This paper reviews the literature on the game-theoretic aspects of GANs and addresses how game theory models can address specific challenges of generative models and improve the GAN's performance. We first present some preliminaries, including the basic GAN model and some game theory background. We then present a taxonomy to classify state-of-the-art solutions into three main categories: modified game models, modified architectures, and modified learning methods. The classification is based on modifications made to the basic GAN model by p...
arXiv (Cornell University), 2022
In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, known as posterior collapse. To mitigate this, state-of-the-art models 'weaken' the 'powerful' decoder by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is compensated for by utilizing the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach increases both sequence modeling performance and the information captured in the latent space.
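As context for the baseline remedy mentioned above, here is a minimal sketch (PyTorch) of uniformly random word dropout applied to the decoder's input tokens; the <unk> replacement index, the dropout rate, and the padding handling are illustrative assumptions rather than details taken from the paper, and the proposed information-based adversarial variant is not shown.

# Uniform word dropout on decoder inputs: random tokens are replaced with <unk>
# so the autoregressive decoder cannot rely solely on previous tokens and must
# use the latent code.
import torch

def word_dropout(decoder_inputs, rate=0.4, unk_idx=1, pad_idx=0):
    """decoder_inputs: LongTensor of shape (batch, seq_len)."""
    keep = torch.rand(decoder_inputs.shape, device=decoder_inputs.device) >= rate
    keep |= decoder_inputs.eq(pad_idx)                      # never corrupt padding
    return torch.where(keep, decoder_inputs, torch.full_like(decoder_inputs, unk_idx))

tokens = torch.tensor([[5, 9, 3, 7, 0, 0]])
print(word_dropout(tokens))                                 # some tokens replaced by index 1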
IEEE, 2024
Generative adversarial networks (GANs) are a cutting-edge approach to generative modeling in deep learning. GANs were proposed in 2014 by Ian Goodfellow. Since then, there has been significant growth in adversarial networks. New breakthroughs and innovative approaches in generative adversarial networks can radically elevate the quality of synthetically generated images by extracting patterns from the original datasets. Among the major advancements of GANs, image synthesis is the most prominent and extensively studied application. The concept of adversarial training, where two neural networks compete against each other, has introduced a novel paradigm for learning complex patterns in images. The paper emphasizes the vital role of GANs in strengthening and fine-tuning datasets, directing further research toward GANs capable of producing high-quality synthetic samples from limited amounts of data.
2021
Generative Modelling has been a very extensive area of research since it finds immense use cases across multiple domains. Various models have been proposed in the recent past, including Fully Visible Belief Nets, NADE, MADE, Pixel RNN, Variational Auto Encoders, Markov Chains, and Generative Adversarial Networks. Amongst all these models, Generative Adversarial Networks have consistently shown huge potential and developments in the areas of Art, Music, Semi-Supervised Learning, Handling Missing Data, Drug Discovery, and Unsupervised Learning. This emerging technology has reshaped the research landscape in the field of generative modeling. Research on Generative Adversarial Networks (GANs) was introduced by Ian J. Goodfellow et al. in 2014 [1]. Since its inception, various models have been proposed over the years and are considered state-of-the-art models in generative modeling. In this survey, we provide a comprehensive review of the original GAN model and it...
2020
Generative Adversarial Networks (GANs) are part of the deep generative model family and are able to generate synthetic samples based on the underlying distribution of real-world data. With expanding interest, new discoveries and recent advances are hard to follow. Recent advancements to stabilize training will help GANs open up new domains using adjusted architectures and loss functions. Various findings show that GANs can be used to generate not only images but also text and audio. This paper presents an overview of different GAN architectures, giving summaries of the underlying fundamentals of each presented GAN. Furthermore, this paper looks into four application domains and lists additional domains. Additionally, it summarizes the datasets and metrics used to evaluate GANs and presents recent scientific advancements. Keywords–generative adversarial networks; machine learning; deep learning.
2021
We present an alternative perspective on the training of generative adversarial networks (GANs), showing that the training step for a GAN generator decomposes into two implicit sub-problems. In the first, the discriminator provides new target data to the generator in the form of "inverse examples" produced by approximately inverting classifier labels. In the second, these examples are used as targets to update the generator via least-squares regression, regardless of the main loss specified to train the network. We experimentally validate our main theoretical result and discuss implications for alternative training methods that are made possible by making these sub-problems explicit. We also introduce a simple representation of inductive bias in networks, which we apply to describing the generator's output relative to its regression targets.
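A hedged sketch of the two implicit sub-problems described above, under the assumption that "inverse examples" are produced by a few gradient-ascent steps that push generator outputs toward higher discriminator scores; the step size, network shapes, and number of inversion steps are placeholders, and discriminator training is omitted.

# Sub-problem 1: build regression targets by approximately inverting the label.
# Sub-problem 2: update the generator by least-squares regression onto those targets.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)

z = torch.randn(64, 8)
fake = G(z)

# Sub-problem 1: push the generated samples toward "real" under D to get targets.
target = fake.detach().clone().requires_grad_(True)
for _ in range(5):
    score = D(target).sum()
    grad, = torch.autograd.grad(score, target)
    target = (target + 0.1 * grad).detach().requires_grad_(True)

# Sub-problem 2: least-squares regression of the generator onto the targets.
loss = (fake - target.detach()).pow(2).mean()
opt_G.zero_grad(); loss.backward(); opt_G.step()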
ArXiv, 2017
We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL). Current log-likelihood training methods are limited by the discrepancy between their training and testing modes, as models must generate tokens conditioned on their previous guesses rather than the ground-truth tokens. We address this problem by introducing a critic network that is trained to predict the value of an output token, given the policy of an actor network. This results in a training procedure that is much closer to the test phase, and allows us to directly optimize for a task-specific score such as BLEU. Crucially, since we leverage these techniques in the supervised learning setting rather than the traditional RL setting, we condition the critic network on the ground-truth output. We show that our method leads to improved performance on both a synthetic task and German-English machine translation. Our analysis paves ...
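The sketch below (PyTorch) illustrates the general actor-critic pattern for sequence generation described here: a critic conditioned on the ground-truth sequence predicts per-token values, and the actor is updated toward tokens with high predicted value. The architecture and dimensions are assumptions for illustration; training the critic itself (regressing toward task scores such as BLEU) is omitted.

# Actor-critic for sequence generation: critic sees both the actor's samples and
# the ground-truth reference; actor maximizes the expected predicted token value.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, hid, T = 100, 64, 10
actor_rnn,  actor_out  = nn.GRU(vocab, hid, batch_first=True), nn.Linear(hid, vocab)
critic_rnn, critic_out = nn.GRU(2 * vocab, hid, batch_first=True), nn.Linear(hid, vocab)

sampled = torch.randint(vocab, (1, T))              # tokens sampled from the actor
truth   = torch.randint(vocab, (1, T))              # ground-truth reference

x = F.one_hot(sampled, vocab).float()
h, _ = actor_rnn(x)
logits = actor_out(h)                               # actor's per-position token distributions

c_in = torch.cat([x, F.one_hot(truth, vocab).float()], dim=-1)
q, _ = critic_rnn(c_in)                             # critic conditioned on the ground truth
q_values = critic_out(q)                            # predicted value of every candidate token

probs = logits.softmax(-1)
actor_loss = -(probs * q_values.detach()).sum(-1).mean()   # expected value under the policy
actor_loss.backward()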
2018
In this work we present a new agent architecture, called Reactor, which combines multiple algorithmic and architectural contributions to produce an agent with higher sample-efficiency than Prioritized Dueling DQN (Wang et al., 2016) and Categorical DQN (Bellemare et al., 2017), while giving better run-time performance than A3C (Mnih et al., 2016). Our first contribution is a new policy evaluation algorithm called Distributional Retrace, which brings multi-step off-policy updates to the distributional reinforcement learning setting. The same approach can be used to convert several classes of multi-step policy evaluation algorithms designed for expected value evaluation into distributional ones. Next, we introduce the β-leave-one-out policy gradient algorithm which improves the trade-off between variance and bias by using action values as a baseline. Our final algorithmic contribution is a new prioritized replay algorithm for sequences, which exploits the temporal locality of ne...
arXiv (Cornell University), 2023
Actor-critic (AC) methods are widely used in reinforcement learning (RL) and benefit from the flexibility of using any policy gradient method as the actor and any value-based method as the critic. The critic is usually trained by minimizing the TD error, an objective that is potentially decorrelated from the true goal of achieving a high reward with the actor. We address this mismatch by designing a joint objective for training the actor and critic in a decision-aware fashion. We use the proposed objective to design a generic AC algorithm that can easily handle any function approximation. We explicitly characterize the conditions under which the resulting algorithm guarantees monotonic policy improvement, regardless of the choice of the policy and critic parameterization. Instantiating the generic algorithm results in an actor that involves maximizing a sequence of surrogate functions (similar to TRPO, PPO) and a critic that involves minimizing a closely connected objective. Using simple bandit examples, we provably establish the benefit of the proposed critic objective over the standard squared error. Finally, we empirically demonstrate the benefit of our decision-aware actor-critic framework on simple RL problems.

1 Introduction. Reinforcement learning (RL) is a framework for solving problems involving sequential decision-making under uncertainty, and has found applications in games [38, 50], robot manipulation tasks [55, 64], and clinical trials [45]. RL algorithms aim to learn a policy that maximizes the long-term return by interacting with the environment. Policy gradient (PG) methods [59, 54, 29, 25, 47] are an important class of algorithms that can easily handle function approximation and structured state-action spaces, making them widely used in practice. PG methods assume a differentiable parameterization of the policy and directly optimize the return with respect to the policy parameters. Typically, a policy's return is estimated by using Monte-Carlo samples obtained via environment interactions [59]. Since the environment is stochastic, this approach results in high variance in the estimated return, leading to higher sample complexity (the number of environment interactions required to learn a good policy). Actor-critic (AC) methods [29, 43, 5] alleviate this issue by using value-based approaches [52, 58] in conjunction with PG methods, and have been empirically successful [20, 23]. In AC algorithms, a value-based method ("critic") is used to approximate a policy's estimated value, and a PG method ("actor") uses this estimate to improve the policy towards obtaining higher returns. Though AC methods have the flexibility of using any method to independently train the actor and critic, it is unclear how to train the two components jointly in order to learn good policies. For example, the critic is typically trained via temporal difference (TD) learning and its objective is to minimize the value estimation error across all states and actions. For large real-world Markov decision processes (MDPs), it is intractable to estimate the values across all states and actions.
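For context, the "standard squared error" critic objective that the paper contrasts against is the usual semi-gradient TD update, sketched below in PyTorch with assumed network sizes and a synthetic transition batch.

# Standard critic training: minimize the squared one-step TD error
# (r + gamma * V(s') - V(s))^2 with the bootstrapped target detached.
import torch
import torch.nn as nn

V = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(V.parameters(), lr=1e-3)
gamma = 0.99

s, s_next = torch.randn(32, 4), torch.randn(32, 4)   # batch of transitions (s, r, s')
r = torch.randn(32, 1)

td_target = r + gamma * V(s_next).detach()            # bootstrap; no gradient through target
critic_loss = (td_target - V(s)).pow(2).mean()

opt.zero_grad(); critic_loss.backward(); opt.step()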
2017 IEEE International Conference on Computer Vision (ICCV)
In this paper, we propose a generative model, Temporal Generative Adversarial Nets (TGAN), which can learn a semantic representation of unlabeled videos, and is capable of generating videos. Unlike existing Generative Adversarial Nets (GAN)-based methods that generate videos with a single generator consisting of 3D deconvolutional layers, our model exploits two different types of generators: a temporal generator and an image generator. The temporal generator takes a single latent variable as input and outputs a set of latent variables, each of which corresponds to an image frame in a video. The image generator transforms a set of such latent variables into a video. To deal with instability in training of GAN with such advanced networks, we adopt a recently proposed model, Wasserstein GAN, and propose a novel method to train it stably in an end-to-end manner. The experimental results demonstrate the effectiveness of our methods.
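A rough architectural sketch of the two-generator idea (dimensions, MLP layers, and frame size are assumptions; the paper's actual generators use deconvolutional networks and WGAN training, which are not shown):

# Temporal generator maps one latent to T per-frame latents; the image generator
# maps each per-frame latent (together with the original latent) to one frame.
import torch
import torch.nn as nn

T, z_dim = 16, 100

temporal_gen = nn.Sequential(                          # z0 -> [z1, ..., zT]
    nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, T * z_dim))
image_gen = nn.Sequential(                             # (z0, zt) -> 64x64 frame
    nn.Linear(2 * z_dim, 512), nn.ReLU(), nn.Linear(512, 64 * 64), nn.Tanh())

z0 = torch.randn(1, z_dim)
zt = temporal_gen(z0).view(T, z_dim)
frames = image_gen(torch.cat([z0.expand(T, -1), zt], dim=1)).view(T, 64, 64)
print(frames.shape)                                    # torch.Size([16, 64, 64])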
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
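The minimax two-player game referred to here is, in the paper's notation, the value function

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big],

with D trained to maximize it and G trained to minimize it by alternating gradient steps.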
2020 International Joint Conference on Neural Networks (IJCNN)
One of the challenging problems in sequence generation tasks is the optimized generation of sequences with specific desired goals. Current sequential generative models mainly generate sequences to closely mimic the training data, without direct optimization of desired goals or properties specific to the task. We introduce OptiGAN, a generative model that incorporates both Generative Adversarial Networks (GAN) and Reinforcement Learning (RL) to optimize desired goal scores using policy gradients. We apply our model to text and real-valued sequence generation, where our model is able to achieve higher desired scores, outperforming GAN and RL baselines while not sacrificing output sample diversity.
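As a hedged sketch of the general mechanism (not OptiGAN's exact formulation), the function below computes a policy-gradient generator loss whose reward mixes the discriminator's realism signal with a task-specific goal score; the mixing weight lam, the mean baseline, and the example inputs are hypothetical placeholders.

# Policy-gradient (REINFORCE-style) generator loss with a mixed reward.
import torch

def generator_pg_loss(log_probs, d_scores, goal_scores, lam=0.5):
    """log_probs:   (batch,) summed log-probability of each generated sequence
       d_scores:    (batch,) discriminator probability that the sequence is real
       goal_scores: (batch,) task-specific desired-property score"""
    reward = (1 - lam) * torch.log(d_scores + 1e-8) + lam * goal_scores
    baseline = reward.mean()                            # simple variance-reduction baseline
    return -((reward - baseline).detach() * log_probs).mean()

loss = generator_pg_loss(torch.randn(8, requires_grad=True), torch.rand(8), torch.rand(8))
loss.backward()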
Recently, generative adversarial networks (GANs) have become a research focus of artificial intelligence. Inspired by the two-player zero-sum game, GANs comprise a generator and a discriminator, both trained under the adversarial learning idea. The goal of GANs is to estimate the potential distribution of real data samples and generate new samples from that distribution. Since their inception, GANs have been widely studied due to their enormous prospects for applications, including image and vision computing, speech and language processing, etc. In this review paper, we summarize the state of the art of GANs and look into the future. First, we survey GANs' background, theoretical and implementation models, and application fields. Then, we discuss GANs' advantages and disadvantages, and their development trends. In particular, we investigate the relation between GANs and parallel intelligence, concluding that GANs have great potential in parallel systems research in terms of virtual-real interaction and integration. Clearly, GANs can provide substantial algorithmic support for parallel intelligence.
IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2012
Policy gradient based actor-critic algorithms are amongst the most popular algorithms in the reinforcement learning framework. Their advantage of being able to search for optimal policies using low-variance gradient estimates has made them useful in several real-life applications, such as robotics, power control and finance. Although general surveys on reinforcement learning techniques already exist, no survey is specifically dedicated to actor-critic algorithms in particular. This paper therefore describes the state of the art of actor-critic algorithms, with a focus on methods that can work in an online setting and use function approximation in order to deal with continuous state and action spaces. After starting with a discussion on the concepts of reinforcement learning and the origins of actor-critic algorithms, this paper describes the workings of the natural gradient, which has made its way into many actor-critic algorithms in the past few years. A review of several standard and natural actor-critic algorithms follows and the paper concludes with an overview of application areas and a discussion on open issues.
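The sketch below shows the textbook one-step actor-critic update of the kind this survey covers: the TD error acts as a low-variance advantage estimate for the policy (actor) gradient, and the same TD error drives the critic. Network shapes and the single synthetic transition are illustrative assumptions.

# One-step actor-critic update on a single transition (s, a, r, s').
import torch
import torch.nn as nn

actor  = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))   # action logits
critic = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 1))   # state value
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=3e-4)
gamma = 0.99

s = torch.randn(4); r = 1.0; s_next = torch.randn(4); done = False      # one transition

dist = torch.distributions.Categorical(logits=actor(s))
a = dist.sample()

td_error = r + gamma * critic(s_next).detach() * (1 - done) - critic(s)
loss = -td_error.detach() * dist.log_prob(a) + td_error.pow(2)          # actor + critic terms

opt.zero_grad(); loss.sum().backward(); opt.step()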
2020 International Joint Conference on Neural Networks (IJCNN), 2020
In this paper, we propose a novel technique for training Generative Adversarial Networks (GANs) using autoencoders. GANs, in recent years, have emerged as one of the most popular generative models. Despite their success, there are several challenges in maintaining the trade-off between diversity and quality of the generated distribution. Our idea stems from the fact that deeper layers of an autoencoder contain high-level feature representation of the input data distribution. Reusing these layers provides GAN with information about the representative characteristics of real data and hence can guide its adversarial training. We call our model Guided GAN since the autoencoder (guiding network) provides a direction to train the GAN (generative network). Guided GAN also minimizes both the forward and reverse Kullback-Leibler (KL) divergence in a single model, exploiting the complementary statistical properties of the two. We conduct extensive experiments and use various metrics for asses...
2017
We propose in this paper a new approach to train Generative Adversarial Nets (GANs) with a mixture of generators to overcome the mode collapsing problem. The main intuition is to employ multiple generators instead of a single one as in the original GAN. The idea is simple, yet proven to be extremely effective at covering diverse data modes, easily overcoming the mode collapsing problem and delivering state-of-the-art results. A minimax formulation is established among a classifier, a discriminator, and a set of generators in a similar spirit to GAN. Generators create samples that are intended to come from the same distribution as the training data, whilst the discriminator determines whether samples are true data or generated by the generators, and the classifier specifies which generator a sample comes from. The distinguishing feature is that internal samples are created from multiple generators, and then one of them will be randomly selected as final output similar...