@yqwangustc
Summary:
As discussed in #21244, we found that some values in log_beta are not properly
initialized. This diff will 1) initialize all log_beta values to -inf; 2) fix a
tricky comparison condition; 3) zero all the gradient elements corresponding to
padding.

Offline experiments show that this diff fixes the previously observed NaN loss.
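
To illustrate why initializing a log-space buffer to -inf matters, here is a minimal pure-Python sketch. It is not the kernel code touched by this diff; `log_add`, `log_sum`, and the `clean`/`dirty` buffers are hypothetical names, and a NaN stands in for whatever an uninitialized cell might contain. Since -inf is log(0), untouched cells drop out of the sum, while a single garbage cell poisons the result.

```python
import math

NEG_INF = float("-inf")

def log_add(a, b):
    """Return log(exp(a) + exp(b)), treating -inf as log(0)."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def log_sum(values):
    """Accumulate a list of log-probabilities in log space."""
    total = NEG_INF  # log(0): the neutral element of log_add
    for v in values:
        total = log_add(total, v)
    return total

# Buffer initialized to -inf: cells that are never written contribute 0 probability.
clean = [NEG_INF] * 4
clean[0] = math.log(0.25)
clean[1] = math.log(0.5)

# Hypothetical stand-in for uninitialized memory: one untouched cell holds NaN.
dirty = list(clean)
dirty[3] = float("nan")

clean_total = log_sum(clean)  # equals log(0.25 + 0.5)
dirty_total = log_sum(dirty)  # the NaN propagates into the total
```

The same reasoning applies to any log-domain recursion: cells outside the valid region need to hold log(0) rather than arbitrary memory.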

Differential Revision: D15637977

fbshipit-source-id: 8bc35098aade6aa2035f71f499250883f09b0f25
@pytorchbot pytorchbot added the module: cuda and module: operators labels Jun 5, 2019
Collaborator

@t-vi t-vi left a comment

Ideally, we'd add a test adding NaNs in the padded part. Other than that, I'm super-thrilled!
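
A rough sketch of the shape such a test could take, in plain Python rather than against the real CTC loss: poison the padded positions with NaN, apply the masking, and check that nothing non-finite survives. The helper `zero_padding_grad`, the `[T][N]` nested-list layout, and the sample values are all assumptions made for illustration, not the code in this PR.

```python
import math

def zero_padding_grad(grad, input_lengths):
    """Return a copy of grad (layout [T][N]) with entries at t >= input_lengths[n] set to 0.0."""
    return [
        [g if t < input_lengths[n] else 0.0 for n, g in enumerate(row)]
        for t, row in enumerate(grad)
    ]

nan = float("nan")

# T=3 time steps, N=2 sequences. Sequence 1 has only 2 valid steps,
# so grad[2][1] is padding and is deliberately poisoned with NaN.
grad = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, nan],
]
input_lengths = [3, 2]

masked = zero_padding_grad(grad, input_lengths)

# After masking, every entry should be finite and valid entries unchanged.
all_finite = all(math.isfinite(g) for row in masked for g in row)
```

A real test would do the equivalent on tensors passed through the loss and its backward pass; this only shows the invariant being checked.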

@yqwangustc yqwangustc closed this Jun 5, 2019
@yqwangustc yqwangustc reopened this Jun 5, 2019
zdevito pushed a commit to zdevito/ATen that referenced this pull request Jun 5, 2019
Summary:
Pull Request resolved: pytorch/pytorch#21392
fbshipit-source-id: 477008a5e11aae946bd2aa401ab7e0c513421af0
@facebook-github-bot
Contributor

This pull request has been merged in b460a19.

@yqwangustc yqwangustc deleted the export-D15637977 branch June 5, 2019 17:51
@soumith soumith changed the title from "Per discussion at https://github.com/pytorch/pytorch/pull/21244, fix bugs in" to "Stabilize CTCLoss by initializing log_beta values to -inf, fixing comparisons, zeroing padding gradients" Jul 20, 2019