[inductor] Fix FakeTensorUpdater handling of HOPs #159523
base: gh/benjaminglass1/97/base
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159523
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 Cancelled Job, 1 Unrelated Failure as of commit 621684a with merge base 20cae80.
CANCELLED JOB - The following job was cancelled. Please retry:
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@zou3519 said he would review - deferring to him on this one.
Spoke with @zou3519 offline, and concluded that the requests for better testing can be done in follow-up PRs, given the existing lack of testing for FakeTensorUpdater.
zou3519 left a comment:
The example_val thing is still suspicious to me. If you could send an example test case where you saw it, that would be helpful.
Pushed a version that updates subgraphs with new placeholder arguments when needed, at the time we process the subgraph invocation. It appears to work, but I've run into a new wrinkle: it looks like in some cases (and I've modified the test to reflect this) the subgraph can be called multiple times. I think this may invalidate the idea that we can change the shape and stride of inputs and outputs, at least trivially. Minimally, we need to skip doing this to subgraphs that are invoked repeatedly; maximally, perhaps we should just throw a loud error when changing placeholders or outputs on a graph?
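A minimal sketch of that wrinkle, assuming the `invoke_subgraph` convention of `(subgraph, identifier, *operands)`; the graph and names below are illustrative only, not the actual test:

```python
# Hypothetical illustration: one subgraph GraphModule referenced from two
# invoke_subgraph call sites. Its placeholders carry a single set of
# metadata, so rewriting them to match one call site can be wrong for the
# other (e.g. if the two inputs end up with different strides).
import torch
import torch.fx as fx

def body(x):
    return x + 1

subgraph = fx.symbolic_trace(body)  # one GraphModule, one placeholder

g = fx.Graph()
a = g.placeholder("a")
b = g.placeholder("b")
sub = g.get_attr("subgraph")
y1 = g.call_function(torch.ops.higher_order.invoke_subgraph, (sub, "subgraph_0", a))
y2 = g.call_function(torch.ops.higher_order.invoke_subgraph, (sub, "subgraph_0", b))
g.output((y1, y2))
print(g)  # two call sites, both pointing at the same get_attr'd subgraph
```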
Writing up where this is at, for visibility:
Update: this is very close now; the only remaining obstacle (pending the test run working, obviously) is handling HOPs that did not initially appear to utilize subgraphs.

EDIT: @zou3519 re-requesting your review to look at the current approach and make sure there are no obvious issues I overlooked.
def extract_subgraphs_and_args(
    node: torch.fx.Node, *args: Any, **kwargs: Any
) -> tuple[tuple[torch.fx.GraphModule, ...], tuple[Any, ...] | None]:
    """HOPs that invoke subgraphs take a number of different forms. This
    function regularizes them, returning a tuple of subgraphs contained in the
    args and a tuple of the args for the subgraphs. This function assumes all
    subgraphs share a set of common arguments.

    This function assumes that node_invokes_subgraph(node, *args, **kwargs) is
    True.

    If the second return value is None, this function was unable to determine
    what args to pass to the subgraph(s)."""
    if node.target is torch.ops.higher_order.cond:
        return tuple(args[1:3]), tuple(args[3])
    if node.target is torch.ops.higher_order.foreach_map:
        return (args[0],), tuple(args[1:])
    if node.target in (
        torch.ops.higher_order.invoke_quant_packed,
        torch.ops.higher_order.invoke_quant,
    ):
        return (args[0],), tuple(args[1:])
    if node.target is torch.ops.higher_order.invoke_subgraph:
        return (args[0],), tuple(args[2:])
    if node.target is torch.ops.higher_order.map_impl:
        # map is applied over slices from the first dimension of each value in
        # args[1].
        return (args[0],), (*(a[0] for a in args[1]), *args[2:])
    if node.target in (
        torch.ops.higher_order.while_loop,
        torch.ops.higher_order.while_loop_stack_output,
    ):
        return tuple(args[:2]), (*args[2], *args[3])
    if node.target is control_deps:
        assert not kwargs, (
            "Subgraph arguments can be renamed, so we cannot consistently "
            "handle kwargs at this point in the stack."
        )
        return (args[1],), tuple(args[2:])
    # These functions don't have clean mappings from node arguments to subgraph
    # inputs, since those mappings are dependent on details of the original
    # invocation that are not preserved. Skip them intentionally.
    if node.target not in (
        torch.ops.higher_order.associative_scan,
        torch.ops.higher_order.flex_attention,
        torch.ops.higher_order.scan,
    ):
        warnings.warn(
            f"Please add support for subgraph args to function {node.target}!"
        )

    # By default, just return the detected list of subgraphs so that we can run
    # updates on all of them.
    return tuple(
        s
        for s in pytree.tree_flatten(args)[0]  # iterate the flattened leaves
        if isinstance(s, torch.fx.GraphModule) and s in self.subgraph_updaters
    ), None
I do not like how this turned out, but I'm not sure of any other way to do this. Every HOP seems to pass the subgraph args through the call in a different way.
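For concreteness, here is my reading of a few of the conventions handled above (argument names are informal and inferred from the indexing, not taken from the PR):

```python
# cond: node args = (pred, true_gm, false_gm, operands)
#   subgraphs     -> args[1:3] = (true_gm, false_gm)
#   subgraph args -> args[3]   = operands (already a sequence)
#
# invoke_subgraph: node args = (subgraph_gm, identifier, *operands)
#   subgraphs     -> (args[0],) = (subgraph_gm,)
#   subgraph args -> args[2:]   = operands
#
# while_loop: node args = (cond_gm, body_gm, carried_inputs, additional_inputs)
#   subgraphs     -> args[:2]             = (cond_gm, body_gm)
#   subgraph args -> (*args[2], *args[3]) = carried + additional inputs
```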
I see the problem
I'm not really sure what to do about this. Maybe we can ship a version of this PR first where we don't update the inside of the HOP if the outside of the HOP changes. If we really need this, then we need a way for each HOP to register how to do a FakeTensorUpdater on it, which is really annoying. Thoughts @eellison?
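Purely as a hypothetical sketch of the "each HOP registers how to do a FakeTensorUpdater on it" idea (none of these names exist in the PR or in PyTorch):

```python
from typing import Any, Callable

import torch

# Hypothetical registry: HOP -> function mapping node args to
# (subgraphs, subgraph_args), mirroring extract_subgraphs_and_args above.
_SUBGRAPH_ARG_EXTRACTORS: dict[Any, Callable[..., tuple]] = {}

def register_subgraph_arg_extractor(hop):
    def decorator(fn):
        _SUBGRAPH_ARG_EXTRACTORS[hop] = fn
        return fn
    return decorator

@register_subgraph_arg_extractor(torch.ops.higher_order.invoke_subgraph)
def _invoke_subgraph_extractor(node, *args, **kwargs):
    # invoke_subgraph(subgraph, identifier, *operands)
    return (args[0],), tuple(args[2:])
```

Each HOP would then opt in next to its own definition instead of growing the if-chain, at the cost mentioned above of every HOP author having to care about FakeTensorUpdater.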
Actually, I think this is fine. Let's say we have y = invoke_subgraph(subgraph, x). The question is whether we need to correct the fake tensors in subgraph given a new x.
- At the very least, we need to change y, given a new x.
- What does "a new x" mean? I'm assuming it has the same shape, but different strides. If it has a different shape, then I'd expect the user needs to change the subgraph themselves, or the subgraph is trivial.
- The subgraph should be resilient to changes in strides. If there is a custom operator that depends on the stride being a certain way, then it will emit some code that will coerce the strides to be what it expects.
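A small illustration of the last point, assuming "coerce the strides" means something like a `.contiguous()` call before the stride-sensitive computation (my assumption, not a quote from the PR):

```python
import torch

def stride_sensitive_op(x: torch.Tensor) -> torch.Tensor:
    # Coerce to the layout the kernel expects before computing, so the result
    # does not depend on the caller's strides.
    return x.contiguous().view(-1).sum()

x_contig = torch.randn(4, 8)               # strides (8, 1)
x_strided = x_contig.t().contiguous().t()  # same shape and values, strides (1, 4)

# Same shape, different strides -- the op still agrees.
assert torch.allclose(stride_sensitive_op(x_contig), stride_sensitive_op(x_strided))
```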
    strict: disabling this flag will cause this function to only evaluate size,
    layout, stride, and device. This is used to validate that arguments are
    equivalent enough for updating subgraphs."""
When do you use strict vs not strict? The comment could be clearer on this.
@zou3519 I'll clarify the comment.
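For reference, a hedged sketch of what the non-strict comparison described in that docstring might look like (the helper name and exact checks are my assumption, not the PR's code):

```python
import torch

def metadata_equivalent(a: torch.Tensor, b: torch.Tensor) -> bool:
    # Non-strict comparison: only size, layout, stride, and device, i.e. just
    # enough to decide whether subgraph placeholders can be safely updated.
    if a.size() != b.size() or a.layout != b.layout or a.device != b.device:
        return False
    if a.layout is torch.strided:  # stride() is only defined for strided tensors
        return a.stride() == b.stride()
    return True
```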
Stack from ghstack (oldest at bottom):
Ensures that subgraphs nested in a GraphModule are updated by FakeTensorUpdater, so HOPs that invoke subgraphs (such as invoke_subgraph) also get updated appropriately in FakeTensorUpdater-managed graphs.

Fixes #156819
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @mlazos