Description
🐛 Bug
There is an issue with how blocks/buffers are allocated on CUDA. This leads to zeroed gradients in the example below for the matrices beyond the 65535th entry.
This only seems to affect the nightly build.
To Reproduce
In the example below, the assert fails because the last entry is all zeros instead of the identity matrix. Decrement n by 1 and it works as expected.
import torch
n = (1 << 16) # - 1
x = torch.rand(n, 2, 2, device='cuda').requires_grad_()
a = x[:, [0], [0]]
b = x[:, [1], [1]]
s = (a + b).sum()
s.backward()
assert torch.allclose(x.grad, torch.eye(2, out=x.new(2, 2)))  # fails

It's probably related to the following, which works on the nightly build for n up to 2^19 - 7, while the limit on the latest release is 2^16:
n = (1 << 19) - 7 # - 1
x = torch.rand(n, 2, 2, device='cuda')
x = (x @ x.transpose(-2, -1)).add_(torch.eye(2, out=x.new(2, 2)))
y = x.detach().requires_grad_()
l = y.cholesky() # <-- fails here
logdet = 2 * l.diagonal(dim1=1, dim2=2).log().sum()
logdet.backward()
assert torch.allclose(y.grad, y.inverse())

Note that here, other than the change in the limit, I get an error when the threshold is passed: RuntimeError: CUDA error: invalid configuration argument. (Which becomes clear after one pastes it in their favorite search engine.) So I would expect something like this for the first code snippet too.
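For reference, a small probe along these lines (my own sketch; grad_is_correct is a hypothetical helper, not part of the snippets above) can be used to check that the first snippet silently zeroes the trailing gradients once the threshold is crossed, rather than raising:

import torch

def grad_is_correct(n):
    # Rebuild the first snippet for a given batch size n and check whether
    # the gradient of the last batch entry is still the identity matrix.
    x = torch.rand(n, 2, 2, device='cuda').requires_grad_()
    s = (x[:, [0], [0]] + x[:, [1], [1]]).sum()
    s.backward()
    return torch.allclose(x.grad[-1], torch.eye(2, device='cuda'))

# Scan around the suspected 2^16 boundary; per the observation above this
# prints True up to n = 65535 and False from n = 65536 on, with no error raised.
for n in range(65534, 65538):
    print(n, grad_is_correct(n))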
Environment
Latest nightly build with CUDA 10 and cuDNN 7.5.
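Additional context
Until this is fixed, a possible workaround (my own untested sketch, assuming the failure is tied to the leading-dimension size seen by a single advanced-indexing backward) is to accumulate the loss in chunks that stay below the 2^16 boundary:

import torch

CHUNK = 1 << 15  # stay well below the suspected 2^16 limit

n = 1 << 16
x = torch.rand(n, 2, 2, device='cuda').requires_grad_()

# Build the loss chunk by chunk so that each advanced-indexing backward only
# ever sees a leading dimension below the threshold.
s = x.new_zeros(())
for start in range(0, n, CHUNK):
    chunk = x[start:start + CHUNK]
    s = s + (chunk[:, [0], [0]] + chunk[:, [1], [1]]).sum()
s.backward()

assert torch.allclose(x.grad, torch.eye(2, out=x.new(2, 2)))  # should pass if chunking sidesteps the issue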