[AOTI] Remove explicit abi_compatible setting in tests #138016
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138016
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit c146673 with merge base 966a1a9.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@desertfire has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
malfet left a comment:
Deleted code is tested code :)
@pytorchbot merge
(Initiating merge automatically since Phabricator Diff has merged)
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Summary: The ABI-compatible mode was turned on by default in #136534. This change removes the non-ABI-compatible logic to greatly simplify the wrapper codegen.
Differential Revision: [D64439676](https://our.internmc.facebook.com/intern/diff/D64439676)
Pull Request resolved: #138009
Approved by: https://github.com/chenyang78
ghstack dependencies: #137982, #138016
Summary: Continue to clean up non-ABI-compatible-mode related code.
Differential Revision: [D64444327](https://our.internmc.facebook.com/intern/diff/D64444327)
Pull Request resolved: #138047
Approved by: https://github.com/chenyang78
ghstack dependencies: #137982, #138016, #138009
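For readers following along, here is a minimal sketch (not the exact diff) of the kind of explicit per-test setting this stack removes. It assumes the `abi_compatible` flag in `torch._inductor.config` and the `torch._export.aot_compile` entry point as they existed around this change:

```python
import torch
import torch._inductor.config as inductor_config


class M(torch.nn.Module):
    def forward(self, x):
        return x + 1


example_inputs = (torch.randn(4),)

# Before this stack: tests pinned the mode explicitly via a config patch.
with inductor_config.patch({"abi_compatible": True}):
    so_path = torch._export.aot_compile(M(), example_inputs)

# After #136534 made ABI-compatible codegen the default, the explicit
# patch is redundant and the call can stand alone:
so_path = torch._export.aot_compile(M(), example_inputs)
```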
I was trying to isolate an IMA (illegal memory access) caused in this test, but I see it was removed and don't see the reason why.

```
PYTORCH_NO_CUDA_MEMORY_CACHING=1 cuda-gdb --args python inductor/test_aot_inductor.py -v -k test_torchvision_transforms_functional_tensor_resize_abi_compatible_cuda
...
r
...
CUDA Exception: Warp Illegal Address
The exception was triggered at PC 0x7ffcab79ac10 triton_poi_fused.to_copy.unsafe_index_add_arange_clamp_mul_sub_view_1 (cnzk43i2k6noh4w4ubolxat6sh2vn7i66ccpqdrc62y3yk7vo3so.py:76)

Thread 1 "python" received signal CUDA_EXCEPTION_14, Warp Illegal Address.
[Switching focus to CUDA kernel 0, grid 108, block (63510,0,0), thread (0,0,0), device 0, sm 0, warp 30, lane 0]
triton_poi_fused.to_copy.unsafe_index_add_arange_clamp_mul_sub_view_1<<<(187500,1,1),(128,1,1)>>> () at /tmp/tmpsz9gwx20/nz/cnzk43i2k6noh4w4ubolxat6sh2vn7i66ccpqdrc62y3yk7vo3so.py:76
```

Are these kernels now tested in another unit, or are we ignoring these issues?
It's just a naming change in this case.
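Concretely, both test ids appear verbatim in this thread; the rename only drops the explicit suffix now that ABI-compatible is the default. A tiny illustrative check (not part of the test suite):

```python
# Old and new test ids, copied from the logs in this thread.
OLD = "test_torchvision_transforms_functional_tensor_resize_abi_compatible_cuda"
NEW = "test_torchvision_transforms_functional_tensor_resize_cuda"

# The rename is purely mechanical: the "_abi_compatible" suffix is dropped.
assert NEW == OLD.replace("_abi_compatible", "")
```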
Thanks for the info, @desertfire! When running a nightly from today:

```
PYTORCH_NO_CUDA_MEMORY_CACHING=1 python inductor/test_aot_inductor.py -v -k test_torchvision_transforms_functional_tensor_resize_cuda -v
test_torchvision_transforms_functional_tensor_resize_cuda (__main__.AOTInductorTestABICompatibleCuda.test_torchvision_transforms_functional_tensor_resize_cuda) ... ERROR

======================================================================
ERROR: test_torchvision_transforms_functional_tensor_resize_cuda (__main__.AOTInductorTestABICompatibleCuda.test_torchvision_transforms_functional_tensor_resize_cuda)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/workspace/src/pytorch/test/inductor/test_aot_inductor.py", line 3686, in setUp
    torch.ops.load_library(str(lib_file_path))
  File "/usr/local/lib/python3.12/dist-packages/torch/_ops.py", line 1357, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.12/ctypes/__init__.py", line 379, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: /usr/local/lib/python3.12/dist-packages/torch/build/lib/libaoti_custom_ops.so: cannot open shared object file: No such file or directory

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (errors=1)
```

So I guess I might need to execute the non-CUDA test first to build the actual lib, but that fails with:

```
PYTORCH_NO_CUDA_MEMORY_CACHING=1 python inductor/test_aot_inductor.py -v -k test_torchvision_transforms_functional_tensor_resize -v
test_torchvision_transforms_functional_tensor_resize_cpu (__main__.AOTInductorTestABICompatibleCpu.test_torchvision_transforms_functional_tensor_resize_cpu) ... ERROR
test_torchvision_transforms_functional_tensor_resize_cpu_with_stack_allocation (__main__.AOTInductorTestABICompatibleCpuWithStackAllocation.test_torchvision_transforms_functional_tensor_resize_cpu_with_stack_allocation) ... ERROR
test_torchvision_transforms_functional_tensor_resize_cpu_with_stack_allocation_and_minimal_arrayref_interface (__main__.AOTInductorTestABICompatibleCpuWithStackAllocationAndMinimalArrayRefInterface.test_torchvision_transforms_functional_tensor_resize_cpu_with_stack_allocation_and_minimal_arrayref_interface) ...
In file included from /usr/local/lib/python3.12/dist-packages/torch/include/torch/csrc/inductor/aoti_runtime/arrayref_tensor.h:3,
                 from /tmp/tmppo5aidep/cj3v2c67ufgabywfqrvifllhohdm3jwuizd3erl2vwvyo6acesld/ccohu4kz2drtkxu47qvmlcjb3zmgk2ve463tdwlfnbqcsy2vcnsn.cpp:2:
/tmp/tmppo5aidep/cj3v2c67ufgabywfqrvifllhohdm3jwuizd3erl2vwvyo6acesld/ccohu4kz2drtkxu47qvmlcjb3zmgk2ve463tdwlfnbqcsy2vcnsn.cpp: In member function ‘Outputs torch::aot_inductor::AOTInductorModel::run_impl_minimal_arrayref_interface(const Inputs&, torch::aot_inductor::DeviceStreamType, AOTIProxyExecutorHandle) [with Inputs = std::tuple<torch::aot_inductor::ArrayRefTensor<float>, torch::aot_inductor::ArrayRefTensor<long int> >; Outputs = std::tuple<torch::aot_inductor::ArrayRefTensor<float> >; torch::aot_inductor::DeviceStreamType = void*; AOTIProxyExecutorHandle = AOTIProxyExecutorOpaque*]’:
/tmp/tmppo5aidep/cj3v2c67ufgabywfqrvifllhohdm3jwuizd3erl2vwvyo6acesld/ccohu4kz2drtkxu47qvmlcjb3zmgk2ve463tdwlfnbqcsy2vcnsn.cpp:769:54: error: cannot convert ‘torch::aot_inductor::ArrayRefTensor<float>’ to ‘AtenTensorHandle’ {aka ‘AtenTensorOpaque*’}
  769 |     AOTI_TORCH_ERROR_CODE_CHECK(aoti_torch_get_sizes(arg0_1, &arg0_1_size));
      |                                                      ^~~~~~
      |                                                      |
      |                                                      torch::aot_inductor::ArrayRefTensor<float>
```

In any case, this is unrelated to this PR; we should follow up in another issue to discuss how to execute this test, as I might be missing something.
@angelayi, libaoti_custom_ops will only be built when
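For anyone hitting the same OSError, here is a hypothetical guard sketch (the path and the torch.ops.load_library call are taken from the traceback above; the test suite itself may handle this differently):

```python
import os
import unittest

import torch

# Path copied from the OSError in the traceback above; adjust per install.
lib_file_path = "/usr/local/lib/python3.12/dist-packages/torch/build/lib/libaoti_custom_ops.so"

if not os.path.exists(lib_file_path):
    # Skip rather than error out in setUp when the prebuilt library is absent.
    raise unittest.SkipTest(f"{lib_file_path} not found; build libaoti_custom_ops first")
torch.ops.load_library(lib_file_path)
```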
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @chauhang
Differential Revision: D64439674