[inductor] Allow cooperative + persistent reductions #138533
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138533
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit 431bda7 with merge base 07b0d63:
BROKEN TRUNK - The following job failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
    # The RSPLIT of cooperative reductions means each thread block is operating on fewer elements
    xnumel, _ = self.numels
    try:
        threshold *= 32 // V.graph.sizevars.size_hint(xnumel)
So `32 // V.graph.sizevars.size_hint(xnumel)` is used to estimate RSPLIT?
Is it expected that this is inconsistent with the similar logic in `triton_heuristics.cooperative_reduction`?
    split = max(1, min(target // xnumel, TRITON_MAX_RSPLIT))
`target` is 64 there.
Yeah, though I'm intentionally under-estimating here to be conservative.
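To make the difference between the two estimates concrete, here is a minimal standalone sketch that only compares the arithmetic quoted in this thread. It does not call into inductor: the function names are hypothetical, and the value used for `TRITON_MAX_RSPLIT` is an assumption chosen for illustration, not taken from this PR.

```python
# Standalone comparison of the two RSPLIT formulas quoted in the review thread.
# Nothing here imports torch; names and the TRITON_MAX_RSPLIT value are
# assumptions made for illustration only.

TRITON_MAX_RSPLIT = 64  # assumed cap, not confirmed by this thread


def conservative_rsplit_estimate(xnumel: int) -> int:
    """Estimate used when scaling the persistent-reduction threshold."""
    return 32 // xnumel


def heuristic_rsplit(
    xnumel: int, target: int = 64, max_rsplit: int = TRITON_MAX_RSPLIT
) -> int:
    """Formula quoted from triton_heuristics.cooperative_reduction."""
    return max(1, min(target // xnumel, max_rsplit))


if __name__ == "__main__":
    print("xnumel  estimate  heuristic")
    for xnumel in (1, 2, 4, 8, 16, 32):
        print(
            f"{xnumel:6d}  {conservative_rsplit_estimate(xnumel):8d}"
            f"  {heuristic_rsplit(xnumel):9d}"
        )
```

Under these assumptions, the estimate never exceeds the split the heuristic would pick for the same `xnumel`, which matches the stated intent of under-estimating.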
Pull Request resolved: #138893 Approved by: https://github.com/shunting314 ghstack dependencies: #138533
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov