
[jit] Autodiff bug on double backward for early expiration of grad_accumulator  #19769

@wanchaol

Description

🐛 Bug

Our autodiff infrastructure has a bug in double backward: the grad_accumulator expires early for some AD formulas (e.g. layer_norm, linear).

It gives an error like the one below:

  File "test/test_jit.py", line 659, in checkTrace
    grads2_ge = torch.autograd.grad(l2_ge, flattened_recording_inputs, allow_unused=allow_unused)
  File "/scratch/wanchaol/local/pytorch/torch/autograd/__init__.py", line 149, in grad
    inputs, allow_unused)
RuntimeError: No grad accumulator for a saved leaf!

This is happening on master now. The bug is somewhat complicated to trigger: after we add a non-trivial AD formula, some non-trivial models throw the above error during double backward, while the same code runs fine in pure autograd mode.
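For context, below is a minimal sketch (not the actual test from the repro steps) of the double-backward pattern that exercises this code path; it assumes a fused AD formula for linear is active in the JIT, and the shapes/warm-up loop are illustrative.

```python
import torch
import torch.nn.functional as F

@torch.jit.script
def fn(x, w, b):
    return F.linear(x, w, b).sum()

x = torch.randn(4, 8, requires_grad=True)
w = torch.randn(16, 8, requires_grad=True)
b = torch.randn(16, requires_grad=True)

# Warm up so the JIT executor switches to its differentiable-graph path.
for _ in range(3):
    torch.autograd.grad(fn(x, w, b), (x, w, b), create_graph=True)

out = fn(x, w, b)
# First backward, keeping the graph so it can be differentiated again.
gx, gw, gb = torch.autograd.grad(out, (x, w, b), create_graph=True)
# Second backward; in the failing case this is where
# "RuntimeError: No grad accumulator for a saved leaf!" is raised.
grads2 = torch.autograd.grad(gx.sum() + gw.sum() + gb.sum(), (x, w, b),
                             allow_unused=True)
```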

To Reproduce

I first noticed the problem in #20284, after adding an AD formula for linear in #20039.

  1. git fetch origin; git checkout gh/wanchaol/5/origin
  2. python setup.py develop
  3. python test/test_jit.py TestEndToEndHybridFrontendModels.test_vae

Expected behavior

The test exercises both plain autograd mode and JIT autodiff mode. Autograd mode works fine, but the second derivative under JIT autodiff raises the error above, which I think is related to our autodiff infrastructure releasing one of the leaf Variables' grad accumulators early.
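A rough sketch of that comparison is below; the helper names are illustrative, not the real test_jit.checkTrace internals. Second derivatives are computed the same way in eager and scripted mode, and only the scripted path raises the error in the failing case.

```python
import torch
import torch.nn.functional as F

def second_grads(fn, inputs):
    # First backward with create_graph=True so it can be differentiated again.
    l1 = fn(*inputs).sum()
    grads = torch.autograd.grad(l1, inputs, create_graph=True, allow_unused=True)
    l2 = sum(g.sum() for g in grads if g is not None)
    # Second backward over the sum of first-order grads.
    return torch.autograd.grad(l2, inputs, allow_unused=True)

def model(x, w, b):
    return F.linear(x, w, b)

inputs = tuple(torch.randn(*shape, requires_grad=True)
               for shape in [(4, 8), (16, 8), (16,)])

eager_grads2 = second_grads(model, inputs)                       # works
jit_grads2 = second_grads(torch.jit.script(model), inputs)       # errors in the failing case
```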

Labels

high priority, oncall: jit, triaged
