Issue description
nn.CrossEntropyLoss() yields incorrect results for large logits.
Code example
import torch
from torch import nn

crit = nn.CrossEntropyLoss()
target = torch.ones(1, dtype=torch.long)

# Two sets of logits describing the same distribution: both tensors have
# equal values in both positions, and softmax is invariant to scale-free
# shifts of equal logits.
y_small = torch.ones(1, 2, dtype=torch.float32)
y_big = y_small * 1e8

loss_small = crit(y_small, target)
loss_big = crit(y_big, target)
# Workaround: shift the logits by their max before computing the loss.
loss_fixed = crit(y_big - y_big.max(), target)

print('loss_small = {}\n'
      'loss_big = {}\n'
      'loss_fixed = {}'.format(loss_small, loss_big, loss_fixed))

loss_small should be equal to loss_big, since both y_small and y_big contain logits that correspond to the same distribution. However, the output of the above is:
loss_small = 0.6931471824645996
loss_big = 0.0
loss_fixed = 0.6931471824645996
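For reference, 0.6931... is the correct value, and it can be checked by hand: with two equal logits the softmax probabilities are [0.5, 0.5], so the cross-entropy of the target class is -log(0.5) = log(2), independent of the scale of the logits:

import math
# Cross entropy with two equal logits: p = [0.5, 0.5] regardless of scale,
# so the loss is -log(p[target]) = -log(0.5) = log(2).
print(math.log(2))  # 0.6931471805599453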
Note that the output is correct (loss_fixed == loss_small) if I apply the trick softmax(x) = softmax(x - c) for any constant c (the e^{-c} factors cancel between numerator and denominator), in particular for c = max(x), which keeps the exponentials from overflowing.
This happens on CPU and GPU.
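For completeness, here is a minimal hand-written sketch of a numerically stable cross-entropy using the same max-subtraction (log-sum-exp) trick; the function name stable_cross_entropy is my own, and this is only an illustration of the technique, not PyTorch's internal implementation:

import torch

def stable_cross_entropy(logits, target):
    # Subtract the row-wise max: softmax (and hence the loss) is invariant
    # to this shift, but it keeps exp() from overflowing for large logits.
    shifted = logits - logits.max(dim=1, keepdim=True)[0]
    # log_softmax(x) = x - log(sum_j exp(x_j)), computed on the shifted logits
    log_probs = shifted - shifted.exp().sum(dim=1, keepdim=True).log()
    # Mean negative log-likelihood of the target class
    return -log_probs.gather(1, target.unsqueeze(1)).mean()

With this, the large logits from the example above give the expected result: stable_cross_entropy(y_big, target) returns 0.6931, matching loss_small.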
System Info
PyTorch version: 0.4.1
Is debug build: No
CUDA used to build PyTorch: 9.0.176
OS: Debian GNU/Linux 8 (jessie)
GCC version: (Debian 4.9.2-10+deb8u1) 4.9.2
CMake version: version 3.0.2
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.0.176
GPU models and configuration:
GPU 0: TITAN Xp
Nvidia driver version: 384.130
cuDNN version: Could not collect
Versions of relevant libraries:
[pip] numpy (1.14.3)
[pip] torch (0.4.1)
[pip] torchvision (0.2.1)
[conda] pytorch 0.4.1 py36_cuda9.0.176_cudnn7.1.2_1 pytorch
[conda] torchvision 0.2.1 py36_1 pytorch