nn.CrossEntropyLoss() yields wrong output for big logits #11752

@fab-jul

Issue description

nn.CrossEntropyLoss() yields incorrect results for big logits.

Code example

import torch
from torch import nn

crit = nn.CrossEntropyLoss()

target = torch.ones(1, dtype=torch.long)

# y_small and y_big hold logits for the same (uniform) distribution.
y_small = torch.ones(1, 2, dtype=torch.float32)
y_big = y_small * 1e8

loss_small = crit(y_small, target)
loss_big = crit(y_big, target)
# Workaround: shift the logits by their max before computing the loss.
loss_fixed = crit(y_big - y_big.max(), target)

print('loss_small = {}\n'
      'loss_big   = {}\n'
      'loss_fixed = {}'.format(
        loss_small, loss_big, loss_fixed))

loss_small should be equal to loss_big, since y_small and y_big contain logits that correspond to the same (uniform) distribution: softmax([1, 1]) == softmax([1e8, 1e8]) == [0.5, 0.5], so both losses should be -log(0.5) ≈ 0.6931. However, the output of the above is:

loss_small = 0.6931471824645996
loss_big   = 0.0
loss_fixed = 0.6931471824645996

Note that the output is correct (loss_fixed == loss_small) if I apply the trick softmax(x) = softmax(x - c), which holds for any constant c; here I use c = max(x).
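
For reference, here is a minimal sketch of the same computation done by hand with the max-subtraction (log-sum-exp) trick; it reproduces loss_small / loss_fixed for both inputs. It assumes torch.logsumexp is available (otherwise the log(sum(exp(...))) can be written out), and manual_cross_entropy is just an illustrative name:

import torch

def manual_cross_entropy(logits, target):
    # Shift each row by its max: softmax(x) == softmax(x - c) for any c,
    # so the result is unchanged but exp() can no longer overflow.
    shifted = logits - logits.max(dim=1, keepdim=True)[0]
    # cross entropy = -log_softmax(x)[target] = logsumexp(x) - x[target]
    log_norm = torch.logsumexp(shifted, dim=1)
    picked = shifted.gather(1, target.unsqueeze(1)).squeeze(1)
    return (log_norm - picked).mean()

t = torch.ones(1, dtype=torch.long)
print(manual_cross_entropy(torch.ones(1, 2), t))        # ~0.6931 (= ln 2)
print(manual_cross_entropy(torch.ones(1, 2) * 1e8, t))  # ~0.6931 as well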

The incorrect loss_big value occurs on both CPU and GPU.
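
Until this is fixed, a thin wrapper that applies the same shift per sample before the built-in loss seems to work as a workaround; stable_cross_entropy is just an illustrative name, not an existing PyTorch API:

import torch
from torch import nn

crit = nn.CrossEntropyLoss()

def stable_cross_entropy(logits, target):
    # Same trick as `y_big - y_big.max()` above, but applied per row, so it
    # also behaves for batches whose rows have different magnitudes.
    shifted = logits - logits.max(dim=1, keepdim=True)[0]
    return crit(shifted, target)

# e.g. stable_cross_entropy(y_big, target) gives ~0.6931, matching loss_small.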

System Info

PyTorch version: 0.4.1
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Debian GNU/Linux 8 (jessie)
GCC version: (Debian 4.9.2-10+deb8u1) 4.9.2
CMake version: version 3.0.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.0.176
GPU models and configuration:
GPU 0: TITAN Xp

Nvidia driver version: 384.130
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] numpy (1.14.3)
[pip] torch (0.4.1)
[pip] torchvision (0.2.1)
[conda] pytorch 0.4.1 py36_cuda9.0.176_cudnn7.1.2_1 pytorch
[conda] torchvision 0.2.1 py36_1 pytorch
