fixed a newly introduced regression in softmin #10066
Conversation
This is bad, sorry about the regression. Would you be up for adding a test in test_nn.py?

I would be happy to add a test to test_nn.py; however, it looks like somewhat of a pain to build and test pytorch from source.
@ktarplee no worries, we'll add it and push this PR through. |
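For reference, a regression test along these lines could go in test_nn.py; this is only a minimal sketch, not necessarily the test that was actually added:

```
import torch
import torch.nn.functional as F

def test_softmin_matches_negated_softmax():
    # softmin(x) is defined as softmax(-x); check they agree along every dim.
    x = torch.randn(2, 3, 4)
    for dim in range(x.dim()):
        assert torch.allclose(F.softmin(x, dim=dim), F.softmax(-x, dim=dim))
```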
Thanks for the fix. Can you also instead submit this PR to the master branch?

@ssnl I've changed it to be on master.
facebook-github-bot left a comment
soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: There is a regression in softmin in 0.4.1 that was not present in 0.4.0. The behavior of softmin(x) should match softmax(-x); instead, in v0.4.1 it is implemented as -softmax(x). These are not the same. The fix is trivial because the bug is due to operator precedence.

This is a major regression that broke my training. I'm not sure how a unit test did not catch this.

```
x = torch.tensor([1, 2, 3.5, 4])
print(F.softmin(x, dim=0))   # wrong output in 0.4.1, correct in 0.4.0
print(F.softmax(-x, dim=0))  # this is what softmin should compute
print(F.softmax(x, dim=0))
print(-F.softmax(x, dim=0))  # this is how softmin is incorrectly implemented in 0.4.1
```

In 0.4.1 this produces

```
tensor([-0.0278, -0.0755, -0.3385, -0.5581])
tensor([0.6668, 0.2453, 0.0547, 0.0332])
tensor([0.0278, 0.0755, 0.3385, 0.5581])
tensor([-0.0278, -0.0755, -0.3385, -0.5581])
```

In 0.4.0 this produces the correct values

```
tensor([ 0.6668, 0.2453, 0.0547, 0.0332])
tensor([ 0.6668, 0.2453, 0.0547, 0.0332])
tensor([ 0.0278, 0.0755, 0.3385, 0.5581])
tensor([-0.0278, -0.0755, -0.3385, -0.5581])
```

Pull Request resolved: pytorch#10066
Differential Revision: D9106995
Pulled By: soumith
fbshipit-source-id: 7332503c6077e8461ad6cd72422c749cf6ca595b