amp + eager backend + training failing for some timm models #97382

@Chillee

Description

🐛 Describe the bug

python benchmarks/dynamo/timm_models.py --backend eager --only mnasnet_100 --accuracy --amp --training

Results in

[2023-03-22 21:57:03,055] torch._dynamo.utils: [ERROR] RMSE (res-fp64): 0.09759, (ref-fp64): 0.01743 and shape=torch.Size([8, 1000])

for me.
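For context, the two RMSE values in that log line come from comparing both the backend's output and the eager low-precision reference against an fp64 baseline; the check fails when the backend's error is much larger than the reference's own error. A minimal plain-Python sketch of that comparison (the `multiplier` value and function names here are illustrative, not the actual `torch._dynamo.utils` code):

```python
import math

def rmse(xs, ys):
    # Root-mean-square error between two equal-length sequences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def accuracy_ok(res, ref, ref_fp64, multiplier=2.0):
    # Compare the backend output (res) and the eager low-precision
    # reference (ref) against an fp64 baseline (ref_fp64); fail when the
    # backend's error is far larger than the reference's inherent error.
    res_error = rmse(res, ref_fp64)
    ref_error = rmse(ref, ref_fp64)
    return res_error <= multiplier * ref_error
```

With the errors from the log above, the result error (0.09759) is roughly 5.6x the reference error (0.01743), so the check fails for any reasonable multiplier.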

Interestingly, switching to the aot_eager backend makes it pass:

python benchmarks/dynamo/timm_models.py --backend aot_eager --only mnasnet_100 --accuracy --amp --training

The cause of this breakage is #95416.

cc: @yanboliang @davidberard98

Versions

N/A

cc @ezyang @soumith @msaroufim @wconstab @ngimel @bdhirsh @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @desertfire

Labels

module: dynamo, oncall: pt2, triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
