
Gradient of gradient explodes (NaN) when training WGAN-GP on MNIST #2534

@MeJn

Description

I was training a WGAN-GP to generate handwritten digit images on the MNIST dataset. Thanks to caogang's code, I modified his network structure and applied a DCGAN architecture to my WGAN-GP.
During training, the gradients suddenly exploded to NaN. The explosion occurred right after the backward pass through the gradient penalty loss.

Before the explosion, the gradients and parameters were always normal and showed no tendency to gradually increase, so I suspect the problem lies in computing the gradient of the gradient (the double backward required by the gradient penalty).
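For reference, below is a minimal sketch of how the gradient penalty's double backward is typically computed in PyTorch. The function name, the `lambda_gp` value, and the `eps` stabilization are assumptions for illustration, not taken from the issue's actual code. A common source of NaN here is the derivative of the gradient norm when the norm is exactly zero, which the small `eps` inside the square root guards against.

```python
import torch
from torch import autograd

def gradient_penalty(critic, real, fake, device="cpu", lambda_gp=10.0, eps=1e-12):
    """Hypothetical WGAN-GP gradient-penalty sketch (not the exact code from this issue)."""
    batch_size = real.size(0)

    # Random interpolation between real and fake samples.
    alpha = torch.rand(batch_size, 1, 1, 1, device=device)
    interpolates = (alpha * real + (1 - alpha) * fake).requires_grad_(True)

    critic_out = critic(interpolates)

    # First backward: gradients of the critic output w.r.t. the interpolates.
    # create_graph=True keeps the graph so the penalty itself can be backpropagated,
    # i.e. the "gradient of the gradient" mentioned above.
    grads = autograd.grad(
        outputs=critic_out,
        inputs=interpolates,
        grad_outputs=torch.ones_like(critic_out),
        create_graph=True,
        retain_graph=True,
    )[0]

    grads = grads.reshape(batch_size, -1)
    # grads.norm(2, dim=1) has an undefined (NaN) derivative at norm == 0;
    # adding eps inside the sqrt keeps the double backward finite.
    grad_norm = torch.sqrt(torch.sum(grads ** 2, dim=1) + eps)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()
```

If the NaN appears only occasionally and without any gradual growth beforehand, checking whether the unstabilized norm (or a division by it) ever hits zero is a reasonable first step under these assumptions.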

Metadata

Labels

todo: Not as important as medium- or high-priority tasks, but we will work on these.
