Recompilation exhausted with size mismatch at index 2 #139474

@bhack

🐛 Describe the bug

Recompilation is exhausted on this module even with an extended limit: torch._dynamo hits config.cache_size_limit (64).

The only change is enabling compilation on this module:
https://github.com/hustvl/ViTMatte/blob/main/modeling/decoder/detail_capture.py#L132-L140

Error logs

 function: 'forward' (/workspace/modeling/decoder/detail_capture.py:131)
[rank6]:W1101 02:43:39.336000 143 site-packages/torch/_dynamo/convert_frame.py:896] [1/64]    last reason: 1/0: tensor 'L['images']' size mismatch at index 2. expected 1696, actual 1248
[rank6]:W1101 02:43:39.336000 143 site-packages/torch/_dynamo/convert_frame.py:896] [1/64] To log all recompilation reasons, use TORCH_LOGS="recompiles".
[rank6]:W1101 02:43:39.336000 143 site-packages/torch/_dynamo/convert_frame.py:896] [1/64] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html.
[rank3]:W1101 02:43:40.706000 140 site-packages/torch/_dynamo/convert_frame.py:896] [1/64] torch._dynamo hit config.cache_size_limit (64)
[rank3]:W1101 02:43:40.706000 140 site-packages/torch/_dynamo/convert_frame.py:896] [1/64]    function: 'forward' (/workspace/modeling/decoder/detail_capture.py:131)
[rank3]:W1101 02:43:40.706000 140 site-packages/torch/_dynamo/convert_frame.py:896] [1/64]    last reason: 1/0: tensor 'L['images']' size mismatch at index 2. expected 1248, actual 1376
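As a triage aid, the warnings above can be summarized to see which tensor and index keep changing and which sizes were observed. The sketch below is a stdlib-only helper, not part of PyTorch: the function name `summarize_mismatches` and the regular expression are assumptions based on the message format shown in these two log lines.

```python
import re
from collections import defaultdict

# The two "last reason" lines copied verbatim from the logs above.
LOG = """\
[rank6]:W1101 02:43:39.336000 143 site-packages/torch/_dynamo/convert_frame.py:896] [1/64]    last reason: 1/0: tensor 'L['images']' size mismatch at index 2. expected 1696, actual 1248
[rank3]:W1101 02:43:40.706000 140 site-packages/torch/_dynamo/convert_frame.py:896] [1/64]    last reason: 1/0: tensor 'L['images']' size mismatch at index 2. expected 1248, actual 1376
"""

# Assumed message shape: tensor '<name>' size mismatch at index N. expected A, actual B
PATTERN = re.compile(
    r"tensor '(.+?)' size mismatch at index (\d+)\. expected (\d+), actual (\d+)"
)


def summarize_mismatches(log_text):
    """Map (tensor name, index) to the sorted list of sizes seen in the log."""
    seen = defaultdict(set)
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match is None:
            continue  # not a size-mismatch line
        name, index, expected, actual = match.groups()
        seen[(name, int(index))].update({int(expected), int(actual)})
    return {key: sorted(sizes) for key, sizes in seen.items()}


summary = summarize_mismatches(LOG)
for (name, index), sizes in summary.items():
    print(f"{name} index {index}: sizes {sizes}")
```

Running the same helper over a full log captured with TORCH_LOGS="recompiles" would show how many distinct sizes index 2 takes across ranks.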

Minified repro

No response

Versions

nightly

cc @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @c-p-i-o @chauhang @penguinwu @ezyang @bobrenjc93

Labels: module: ddp, module: dynamic shapes, oncall: distributed, oncall: pt2, pt2d-triage-nov2024, triaged