[TEST] Modernize test_sort_large #155546

Aidyn-A · 2025-06-10T07:34:07Z

Since its introduction ~4 years ago, the test test_sort_large has always been deselected because it requires 200GB of CUDA memory. Now that GPUs this large exist, the test gets selected, but it fails for two reasons: var_mean is not a member of torch.Tensor, and var_mean accepts only floating-point tensors.
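For readers without a 200GB GPU, the two failure modes can be illustrated at a much smaller scale on CPU. This is only a sketch, not the diff in this PR: the sizes (16 columns, 5 rows) and the float32 conversion are assumptions chosen for illustration. It calls the function torch.var_mean instead of a tensor method and converts the sorted values and indices to a floating-point dtype first.

```python
import torch

# Hypothetical small sizes; the real test needs about 200GB of CUDA memory.
n_cols, n_rows = 16, 5

# Every row is the same permutation, so sorting each row gives identical results.
t0 = torch.randperm(n_cols).to(torch.uint8)
t = t0.view(1, n_cols).expand(n_rows, -1).contiguous()
v, i = t.sort()

# Per the PR description, torch.Tensor has no var_mean method, so the function
# form torch.var_mean is used here. It accepts only floating-point input, so
# the uint8 values and int64 indices are converted first (float32 is an
# assumption made for this sketch).
vv, vm = torch.var_mean(v.to(torch.float32), dim=0)
iv, im = torch.var_mean(i.to(torch.float32), dim=0)

# All rows are identical, so the variance across rows is zero and the mean
# across rows equals any single sorted row.
expected_values = torch.arange(n_cols, dtype=torch.float32)
expected_indices = t0.sort().indices.to(torch.float32)
```

Because every row is identical, checking the variance and mean across rows verifies all rows at once without keeping a second full-size tensor around for comparison.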

pytorch-bot · 2025-06-10T07:34:10Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/155546

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 1 Pending, 2 Unrelated Failures

As of commit fea7174 with merge base 3863bbb:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

pull / linux-jammy-cuda12.6-py3.10-gcc11-sm89 / test (default, 3, 5, lf.linux.g6.4xlarge.experimental.nvidia.gpu) (gh) (similar failure)
test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_int16

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

pull / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, lf.linux.2xlarge, unstable) (gh)
exir/backend/test/test_to_backend_multi_method.py::TestToBackendMultiMethod::test_multi_method_end_to_end

This comment was automatically generated by Dr. CI and updates every 15 minutes.

zou3519 · 2025-06-10T13:35:17Z

test/test_sort_and_select.py

+        self.assertEqual(vm, torch.arange(8192, dtype=dtype, device=device))
+        self.assertEqual(im, t0.sort().indices, exact_dtype=False)


Do we run this test in CI?

Aidyn-A · 2025-06-10

No, CI machines are not that big, but I have tested it on a GB300, which has 288GB of memory.

Aidyn-A · 2025-06-10T17:12:08Z

@pytorchbot merge

pytorchmergebot · 2025-06-10T17:14:11Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Currently std::min -> ::min does not work as expected on ROCm when input values are >= 2147483648.
Replace std::min with a ternary statement. Alternatively, std::min can be given an explicit type: std::min<int64_t>.

Fixes on ROCm: test_sort_and_select.py::TestSortAndSelectCUDA::test_sort_large_cuda_float16
Error: RuntimeError: Cannot sort dimension of length 8192

Combines upstream PRs:
- pytorch#161054 to fix std::min on ROCm
- pytorch#155546 fix python test
- pytorch#159939 change test dtype from int8 to float16

Fixes: SWDEV-526432

modernize test_sort_large

fea7174

pytorch-bot bot added the topic: not user facing label Jun 10, 2025

Aidyn-A requested a review from malfet June 10, 2025 07:34

pytorchbot added the open source label Jun 10, 2025

zou3519 added the triaged label Jun 10, 2025

zou3519 reviewed Jun 10, 2025

eqy approved these changes Jun 10, 2025

pytorch-bot bot added the ciflow/trunk label Jun 10, 2025

pytorchmergebot added the merging label Jun 10, 2025

pytorchmergebot added the Merged label Jun 10, 2025

pytorchmergebot closed this in 0ca2a79 Jun 10, 2025

pytorchmergebot removed the merging label Jun 10, 2025

dnikolaev-amd mentioned this pull request Aug 21, 2025

[rocm7.1_internal_testing] fix large tensor sort on ROCm ROCm/pytorch#2543

Merged

5 participants
