Description
🐛 Describe the bug
The warning is triggered by this line in Inductor, `pytorch/torch/_inductor/compile_fx.py`, line 317 at commit `27302a4`:

```python
and not torch.backends.cuda.matmul.allow_tf32
```
```
(Worker_TP1 pid=2300007) /data/youkaichao/uv_envs/tmp_test_m2/lib/python3.12/site-packages/torch/backends/cuda/__init__.py:131: UserWarning: Please use the new API settings to control TF32 behavior, such as torch.backends.cudnn.conv.fp32_precision = 'tf32' or torch.backends.cuda.matmul.fp32_precision = 'ieee'. Old settings, e.g, torch.backends.cuda.matmul.allow_tf32 = True, torch.backends.cudnn.allow_tf32 = True, allowTF32CuDNN() and allowTF32CuBLAS() will be deprecated after Pytorch 2.9. Please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at /pytorch/aten/src/ATen/Context.cpp:80.)
(Worker_TP1 pid=2300007) return torch._C._get_cublas_allow_tf32()
```
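For reference, the new-style settings the warning recommends look like this. The attribute names are taken verbatim from the warning message itself; this assumes PyTorch 2.9+ with the CUDA backends available, so treat it as a sketch rather than a confirmed migration path:

```python
import torch

# New string-valued precision controls named in the deprecation warning.
# "tf32" opts in to TensorFloat-32 math; "ieee" keeps strict fp32.
torch.backends.cuda.matmul.fp32_precision = "tf32"   # was: allow_tf32 = True
torch.backends.cudnn.conv.fp32_precision = "tf32"    # was: cudnn.allow_tf32 = True
```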
This happens when vLLM uses torch.compile (which is enabled by default).
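The mechanism behind the noise can be sketched without CUDA: the old boolean flag becomes a shim over a new string-valued `fp32_precision` setting and warns on every read, so any hot path that still reads the old flag (like the Inductor line above) spams the warning. This is a toy stand-in for illustration, not PyTorch's actual implementation:

```python
import warnings


class MatmulBackend:
    """Toy stand-in for torch.backends.cuda.matmul (illustration only)."""

    def __init__(self):
        # New-style setting: "tf32" or "ieee" (strict fp32).
        self.fp32_precision = "ieee"

    @property
    def allow_tf32(self):
        # Old boolean API: every read now warns, then delegates to the
        # new string-valued setting -- the behavior seen in the log above.
        warnings.warn(
            "Please use the new API: matmul.fp32_precision = 'tf32' or 'ieee'",
            UserWarning,
        )
        return self.fp32_precision == "tf32"

    @allow_tf32.setter
    def allow_tf32(self, value):
        self.fp32_precision = "tf32" if value else "ieee"


matmul = MatmulBackend()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flag = matmul.allow_tf32  # each read emits one UserWarning
matmul.allow_tf32 = True      # old setter maps onto the new setting
```

A fix on the caller side is simply to stop reading the deprecated flag on the compile path and consult the new setting instead.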
Versions
PyTorch 2.9
cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @aakhundov @coconutruben