GeneralizedRCNN returns NaNs with torch.uint8 inputs #3228
🐛 Bug
The FasterRCNN model (and, more generally, the GeneralizedRCNN class) expects a list of floating-point PyTorch tensors as input images, but if you pass it a list of tensors with dtype torch.uint8, the model returns NaN values in the normalization step and, as a consequence, in the loss computation.
To Reproduce
Steps to reproduce the behavior:
1. Load an image as a PyTorch tensor with dtype torch.uint8, along with its corresponding target dictionary
2. Create an instance of FasterRCNN and pass that image to the model
3. Observe the output of the model, which should be the dictionary of losses with all NaN values
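The failure can also be reproduced in isolation, without building a full model, by applying the normalization arithmetic directly to a uint8 tensor. This sketch (plain torch, no torchvision needed) shows how the default ImageNet mean/std truncate to zero when cast to the image dtype:

```python
import torch

# A fake uint8 image, as it would come from decoding a file.
image = torch.randint(0, 256, (3, 32, 32), dtype=torch.uint8)

# The same conversion GeneralizedRCNNTransform.normalize performs:
# the fractional ImageNet statistics truncate to 0 in uint8.
mean = torch.as_tensor([0.485, 0.456, 0.406], dtype=image.dtype)
std = torch.as_tensor([0.229, 0.224, 0.225], dtype=image.dtype)
print(mean)  # all zeros

# Subtracting zero and dividing by zero yields Inf/NaN values.
out = (image - mean[:, None, None]) / std[:, None, None]
print(torch.isinf(out).any() or torch.isnan(out).any())
```

Those Inf/NaN pixel values then propagate through the network into every entry of the loss dictionary.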
Expected behavior
I would have expected the model to raise an exception, or at least a warning. In particular, since the GeneralizedRCNN class takes care of transformations such as normalization and resizing, in my opinion it should also check the dtype of the input images in order to avoid such silent errors.
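The check suggested above could look like the following sketch. `check_image_dtype` is a hypothetical helper name, not part of torchvision; it simply rejects non-floating-point images with a clear error message before normalization runs:

```python
import torch

def check_image_dtype(images):
    """Hypothetical validation helper: fail fast on integer images
    instead of letting normalization silently produce NaNs."""
    for img in images:
        if not img.is_floating_point():
            raise TypeError(
                f"Expected images with a floating-point dtype, got {img.dtype}. "
                "Convert with image.float() before passing them to the model."
            )

# Example: a uint8 image is rejected, a float image passes through.
check_image_dtype([torch.rand(3, 32, 32)])  # OK
# check_image_dtype([torch.zeros(3, 32, 32, dtype=torch.uint8)])  # raises TypeError
```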
Environment
PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 10.15.7 (x86_64)
GCC version: Could not collect
Clang version: 12.0.0 (clang-1200.0.32.28)
CMake version: version 3.18.4
Python version: 3.8 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.19.4
[pip3] torch==1.7.1
[pip3] torchvision==0.8.2
[conda] Could not collect
Additional context
I realized that the error I was facing is caused by the normalize function of the GeneralizedRCNNTransform class, which converts the mean and standard-deviation lists to tensors using the image's dtype. In the default case (ImageNet mean/std), the fractional values truncate to all zeros when cast to torch.uint8, so the subsequent division produces NaN/Inf values.
```python
def normalize(self, image):
    dtype, device = image.dtype, image.device
    mean = torch.as_tensor(self.image_mean, dtype=dtype, device=device)
    std = torch.as_tensor(self.image_std, dtype=dtype, device=device)
    return (image - mean[:, None, None]) / std[:, None, None]
```

To avoid this problem, a simple `image.float()` would suffice.
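A standalone sketch of that fix (written as a free function for illustration, assuming the same mean/std handling as the method above) could be:

```python
import torch

def normalize(image, image_mean, image_std):
    """Sketch of the suggested fix: cast integer images to float before
    building the mean/std tensors, so the ImageNet statistics are not
    truncated to integer zeros."""
    if not image.is_floating_point():
        image = image.float()
    dtype, device = image.dtype, image.device
    mean = torch.as_tensor(image_mean, dtype=dtype, device=device)
    std = torch.as_tensor(image_std, dtype=dtype, device=device)
    return (image - mean[:, None, None]) / std[:, None, None]

# A uint8 input now produces finite values instead of NaN/Inf.
img = torch.randint(0, 256, (3, 32, 32), dtype=torch.uint8)
out = normalize(img, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print(torch.isfinite(out).all())
```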