[Inductor] Refactor "r" reduction prefix to {"r0_", "r1_"}. #142020

blaine-rister · 2024-12-04T00:45:48Z

Preparatory refactor for #137243.

Feature

This PR changes the RINDEX symbol type to (R0_INDEX, R1_INDEX), and the "r" prefix to ("r0_", "r1_"). This allows the relevant code to support 2D (and, more generally, ND) reductions. Unlike the parent PR, this one does not change the tiling algorithm, so "r1_" is never used. However, it prepares other parts of the system to handle "r1_" once we start using it. This should significantly reduce the chances of hitting merge conflicts, making the parent PR much easier to land.

The only change to the generated triton code is to rename "rindex" -> "r0_index", "RBLOCK" -> "R0_BLOCK", etc. To maintain compatibility with existing codegen, this also generates aliases to the old reduction variables, such as rindex = r0_index. If we generated 2D reductions (which this PR will not do), the aliases would be more complicated and would collapse 2D multi-indices to linear indices. See some example kernels in the parent PR.

These aliases can be eliminated by the Triton compiler, and should not impact the final machine code running on the GPU. See the perf testing in the parent PR which confirms the aliases do not impact perf.
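As a rough illustration of the aliasing scheme in the 1D case, the sketch below emits alias lines of the form described above. The helper name `legacy_alias_lines` and the chosen suffix list are hypothetical and only stand in for Inductor's actual codegen, which is not shown in this thread.

```python
# Hypothetical helper for illustration only; Inductor's real codegen differs.
def legacy_alias_lines(suffixes=("index", "offset", "mask", "base")):
    """Return lines that alias the old "r" names to the new "r0_" names.

    Only the 1D case is covered, where each old name maps to exactly one
    new name. A 2D reduction would instead need to combine r0_ and r1_.
    """
    return [f"r{suffix} = r0_{suffix}" for suffix in suffixes]


alias_lines = legacy_alias_lines()
print("\n".join(alias_lines))
```

Because each alias is a plain assignment, a compiler can treat the old and new names as the same value, which is consistent with the claim that the aliases do not affect the final machine code.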

Test plan

The existing CI provides good coverage. This PR modifies the expected code in a few places, renaming reduction variables from r.* to r0_.*.

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

… brister/tiling_dict

pytorch-bot · 2024-12-04T00:45:52Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/142020


✅ No Failures

As of commit 3e133ae with merge base d3d1a78:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

torch/_inductor/codegen/simd.py

jansel · 2024-12-04T22:42:03Z

torch/_inductor/codegen/simd.py


    def initialize_range_tree(self, pid_cache):
-        no_r_dim = not self.inside_reduction or self.numels[-1] == 1
+        prefixes = OrderedSet(["z", "y", "x", "r0_", "r1_"])


Move this to global scope so it is slightly less expensive at runtime.

Changed 95704ec
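A minimal sketch of the suggestion, assuming a plain tuple as a stand-in for OrderedSet so the example has no dependencies. The names `ALL_PREFIXES` and `is_known_prefix` are hypothetical and do not appear in the PR.

```python
# Built once at import time instead of on every call to the method.
# A tuple stands in for OrderedSet here (assumption, to stay dependency-free).
ALL_PREFIXES = ("z", "y", "x", "r0_", "r1_")


def is_known_prefix(prefix: str) -> bool:
    """Check a prefix against the module-level constant."""
    return prefix in ALL_PREFIXES


print(is_known_prefix("r0_"), is_known_prefix("r"))
```

The saving per call is small, but the constant never changes, so there is no reason to rebuild it each time a range tree is initialized.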

torch/_inductor/codegen/triton.py

jansel · 2024-12-04T22:49:25Z

torch/_inductor/codegen/triton.py

+            sympy_product(rn_numels[idx + 1 :]) for idx in range(len(rn_prefixes) - 1)
+        ] + [sympy.Integer(1)]
+
+    def _flatten_reduction_inds(self, multi_inds: List[sympy.Expr]) -> sympy.Expr:


full words for function names

Changed 417bc9f

jansel · 2024-12-04T22:49:40Z

torch/_inductor/codegen/triton.py

+        coeffs = self._get_reduction_index_coeffs()
+        return sympy_dot(coeffs, multi_inds)
+
+    def codegen_reduction_inds(self, buffer) -> None:


full words for function names

Changed 417bc9f
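The two hunks above compute per-dimension coefficients and then take a dot product to collapse a multi-index into a linear index. A minimal sketch of that arithmetic, using plain integers and `math.prod` in place of sympy expressions (an assumption made so the example runs without sympy); the function names are illustrative rather than the ones in the PR:

```python
import math
from typing import List


def reduction_index_coeffs(rn_numels: List[int]) -> List[int]:
    """Coefficient of dim i is the product of the sizes of all later dims.

    The last dimension always gets coefficient 1 (row-major order).
    """
    return [
        math.prod(rn_numels[idx + 1:]) for idx in range(len(rn_numels) - 1)
    ] + [1]


def flatten_reduction_indices(multi_inds: List[int], rn_numels: List[int]) -> int:
    """Collapse a multi-index into a linear index via a dot product."""
    coeffs = reduction_index_coeffs(rn_numels)
    return sum(c * i for c, i in zip(coeffs, multi_inds))


# For a 4 x 5 reduction, index (2, 3) maps to 2 * 5 + 3 * 1.
flat = flatten_reduction_indices([2, 3], [4, 5])
print(reduction_index_coeffs([4, 5]), flat)
```

With a single reduction dimension the coefficient list is just [1], so the linear index equals the r0_ index, which matches the simple 1D alias described in the PR summary.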

jansel · 2024-12-04T22:53:07Z

torch/_inductor/runtime/coordinate_descent_tuner.py


    def prefix_to_size_hint(self, prefix: str) -> Optional[int]:
-        size_hint_idx = {"X": 0, "Y": 1, "Z": 2, "R": -1}[prefix]
+        size_hint_idx = {"X": 0, "Y": 1, "Z": 2, "R0_": -1, "R1_": -2}[prefix]


Isn't -2 wrong? Or is the order swapped?

size_hints = [x, r0, r1]
size_hints[-2] is r0
size_hints[-1] is r1

However:

size_hints = [x, r0]
size_hints[-1] is r0

So you actually need a different index based on the tiling now...

I think you're right about this. This does seem wrong. I'll add some unit tests to confirm.

On second thought, this problem is exactly the reason why we made numels into a dict in #141751. Since size_hints is basically a rounded up version of numels, it seems like that should be a dict as well. That would make this code much simpler. I think I'll open a separate PR for that.
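The indexing problem and the dict-based alternative can be sketched as follows. The sizes are made up, and the lowercase dict keys are an assumption modeled on the numels dict mentioned above; `r0_hint_from_list` and `prefix_to_size_hint` here are illustrative functions, not the tuner's real methods.

```python
from typing import Dict, List, Optional


def r0_hint_from_list(size_hints: List[int]) -> int:
    """Look up r0 with a fixed negative index, as a list-based scheme would."""
    return size_hints[-1]


one_d = [1024, 64]      # [x, r0]
two_d = [1024, 64, 8]   # [x, r0, r1]

# The same fixed index returns r0 for 1D tiling but r1 for 2D tiling.
print(r0_hint_from_list(one_d), r0_hint_from_list(two_d))


def prefix_to_size_hint(size_hints: Dict[str, int], prefix: str) -> Optional[int]:
    """Dict lookup by prefix: independent of how many dims are tiled."""
    return size_hints.get(prefix.lower())


hints_1d = {"x": 1024, "r0_": 64}
hints_2d = {"x": 1024, "r0_": 64, "r1_": 8}
print(prefix_to_size_hint(hints_1d, "R0_"), prefix_to_size_hint(hints_2d, "R0_"))
```

Keying by prefix removes the need to pick a different index depending on the tiling, and a prefix that is not tiled simply yields None.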

Created a separate PR for the size hints refactor #142249. I'll revisit this one once that lands.

The size_hints PR landed. Merged with this one in 9573db1.

Co-authored-by: Jason Ansel

facebook-github-bot · 2024-12-12T16:39:03Z

@pytorchbot merge -i