Skip to content

Conversation

@blaine-rister
Copy link
Contributor

@blaine-rister blaine-rister commented Dec 31, 2024

Issue

This PR cleans up an edge case that wasn't handled by #137243. The existing tiling code assumes that node.get_ranges() is a reliable source of pointwise and reduction numels. This is true for pointwise kernels, but the situation is more complicated with reductions. Since reductions change the number of elements in a tensor, not all ops within a reduction kernel will have the same number of iterations. For example, var_mean fuses pointwise division with the output of reduction sum, and the division lacks the corresponding reduction ranges.

Fix

Instead of getting numels from node.get_ranges(), explicitly pass the global pointwise and reduction numels to the relevant tiling functions. In SIMDKernel.complete_partial_tiling, we solve for the missing numel by diving the global numel by the partial tiling's numel. This ensures all tilings have the correct global numel.

Also, in SIMDKernel.is_compatible, add the global reduction numel to node ranges that are missing it. For example, {"x": 8, "r0_": 8} is compatible with a node of ranges ([8], []) when we have reduction_numel=8.

Finally, this PR generalizes some of the existing codegen to handle multiple reduction dims. We already had code to ignore reduction splits for pointwise kernels, but it only worked for 1D reductions. Now it can handle ND.

Test plan

This PR parametrizes the existing CI test for var_mean to also run with tiled reductions. It also adds a new test checking that var_mean generates 2D tilings (with tiled reduction enabled). These new tests would fail on the current main branch.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

@blaine-rister blaine-rister requested a review from jansel December 31, 2024 21:50
@pytorch-bot
Copy link

pytorch-bot bot commented Dec 31, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/144041

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 6cfdda3 with merge base a174ee2 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@blaine-rister blaine-rister force-pushed the brister/reduction_numel branch from b05cf37 to cb1179f Compare December 31, 2024 22:14
@blaine-rister blaine-rister changed the title [Inductor] [Inductor] Generalize tiling algorithm to handle fused reductions Dec 31, 2024
@blaine-rister blaine-rister marked this pull request as ready for review January 2, 2025 19:57
@blaine-rister
Copy link
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jan 2, 2025
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

The merge job was canceled or timed out. This most often happen if two merge requests were issued for the same PR, or if merge job was waiting for more than 6 hours for tests to finish. In later case, please do not hesitate to reissue the merge command
For more information see pytorch-bot wiki.

@jansel
Copy link
Contributor

jansel commented Jan 3, 2025

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@github-actions github-actions bot deleted the brister/reduction_numel branch February 3, 2025 02:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants