Conversation

@anijain2305 anijain2305 requested a review from zou3519 as a code owner March 13, 2025 00:12

pytorch-bot bot commented Mar 13, 2025

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/149087

✅ You can merge normally! (3 Unrelated Failures)

As of commit 4386a6c with merge base ce54c43:

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

anijain2305 added a commit that referenced this pull request Mar 13, 2025
ghstack-source-id: 9c558f4
Pull Request resolved: #149087
@anijain2305 anijain2305 added the topic: not user facing label Mar 18, 2025
anijain2305 added a commit that referenced this pull request Mar 18, 2025
ghstack-source-id: d8cff23
Pull Request resolved: #149087
registered_hop_fake_fns: dict[torch._ops.OpOverload, Callable] = {}


def register_hop_fake(hop, fn=None):

Contributor:

nit: call this just register_fake? the fqn already has HOP in it (torch._higher_order_ops.register_fake)
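
For context, a minimal sketch of what such a registration helper could look like, using the suggested register_fake name (an illustration, not the exact code in the diff; the dict is keyed on HigherOrderOperator here, whereas the snippet above annotates it as OpOverload):

from typing import Callable, Optional

import torch

# Maps a HigherOrderOperator to its fake (meta) implementation for FakeTensorMode.
registered_hop_fake_fns: dict[torch._ops.HigherOrderOperator, Callable] = {}


def register_fake(hop: torch._ops.HigherOrderOperator, fn: Optional[Callable] = None):
    # Register fn as the fake implementation of hop. Usable directly,
    # register_fake(hop, fn), or as a decorator, @register_fake(hop).
    def register(inner_fn: Callable) -> Callable:
        registered_hop_fake_fns[hop] = inner_fn
        return inner_fn

    if fn is None:
        return register  # decorator form
    return register(fn)  # direct-call form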


def __hash__(self):
    return id(self.subgraph)

Contributor:

nit: Can you add some sort of repr so we know what we're dealing with when we're debugging?

Contributor Author:

Added.
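
For illustration, here is roughly what a wrapper with an identity-based hash and a debugging repr could look like; the class name SubgraphWrapper and its attributes are hypothetical stand-ins for the wrapper in the diff:

import torch


class SubgraphWrapper:
    # Hypothetical stand-in for the subgraph wrapper discussed above.

    def __init__(self, subgraph: torch.fx.GraphModule):
        self.subgraph = subgraph

    def __hash__(self):
        # Two wrappers hash the same only if they wrap the very same subgraph
        # object, which is what the cache key relies on.
        return id(self.subgraph)

    def __repr__(self):
        # A short, readable summary for debugging cache hits and misses.
        num_nodes = len(list(self.subgraph.graph.nodes))
        return f"SubgraphWrapper(subgraph={id(self.subgraph):#x}, nodes={num_nodes})"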

# For debugging / testing: Validate that the output synthesized
# from the cache matches the output created by normal dispatch.
self._crosscheck_cache_output(output, func, types, args, kwargs)
with disable_fake_tensor_cache(self):

Contributor:

btw, when is self.cache_crosscheck_enabled set to True? I assume it is set to True somewhere in testing which is why this change was necessary

Contributor Author:

Yes, in test_fake_tensor.py with

torch._dynamo.config.fake_tensor_cache_crosscheck_enabled = True
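
As a sketch of how a test might exercise this (assuming the companion fake_tensor_cache_enabled flag and the standard torch._dynamo.config.patch helper; the flags must be set before the mode is constructed, since they are read in __init__):

import torch
import torch._dynamo
from torch._subclasses.fake_tensor import FakeTensorMode

with torch._dynamo.config.patch(
    fake_tensor_cache_enabled=True,
    fake_tensor_cache_crosscheck_enabled=True,
):
    with FakeTensorMode():
        x = torch.randn(4, 4)  # a FakeTensor under the active mode
        y1 = x + x
        y2 = x + x  # second call can hit the dispatch cache and gets crosschecked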

# caching implementation, e.g., data dependent ops or ops that modify
# the inputs.
from torch._higher_order_ops.utils import registered_hop_fake_fns

Contributor:

type of func is now wrong, need to update it to be a union

Contributor Author:

So, I tried this earlier and it causes a bunch of mypy failures, because mypy then expects func to have tags and other attributes all over the place, which are not present on HigherOrderOperator.

Contributor:

yeah don't worry about it

Comment on lines +1459 to +1460
isinstance(func, torch._ops.HigherOrderOperator)
and func in registered_hop_fake_fns

Contributor:

I'm not sure this is entirely correct: if the result of the HOP has data-dependent output shape or dynamic output shape, then we need to bail out (

if torch.Tag.data_dependent_output in func.tags:
    raise _BypassDispatchCache("data dependent output")
if torch.Tag.dynamic_output_shape in func.tags:
    raise _BypassDispatchCache("dynamic output shape")
)

Contributor Author:

Now, I remember why I needed a validator function. Ah.

Contributor Author:

Adding new logic that goes through the subgraph nodes and checks each of them.
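
A rough sketch of that kind of subgraph validation; names are illustrative and the bypass exception is a local stand-in for the real one:

import torch
from torch.fx import GraphModule


class _BypassDispatchCache(Exception):
    # Stand-in for the bypass exception used by the FakeTensor dispatch cache.
    pass


def _validate_subgraph_for_caching(gm: GraphModule) -> None:
    # Walk every op invoked by the subgraph and refuse to cache if any of them
    # is ineligible (the same tags the per-op path already checks).
    for node in gm.graph.nodes:
        if node.op != "call_function":
            continue
        tags = getattr(node.target, "tags", ())
        if torch.Tag.data_dependent_output in tags:
            raise _BypassDispatchCache("data dependent output in subgraph")
        if torch.Tag.dynamic_output_shape in tags:
            raise _BypassDispatchCache("dynamic output shape in subgraph")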

Comment on lines 1536 to 1542
elif isinstance(arg, torch.fx.GraphModule):
    # This is used for invoke_subgraph where id(graph_module) allows
    # us to cache fake outputs
    result.append(type(arg))
    result.append(id(arg))
elif isinstance(arg, FunctionalizeCtxWrapper):
    result.append(hash(arg))

Contributor:

umm, does id(arg) assume that the GraphModule stays alive forever? (What if the GraphModule gets deallocated and another one gets allocated in its stead?)

We might need some cache invalidation mechanism via weakref.finalize

Contributor Author:

Doing it here - #149667
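
For reference, the kind of weakref.finalize-based invalidation being discussed could look roughly like this; the cache and key shape here are hypothetical, and the actual change is in the linked PR:

import weakref

import torch.fx

# Hypothetical cache keyed by tuples that embed id(graph_module), standing in
# for the real FakeTensor dispatch cache.
_fake_output_cache: dict[tuple, object] = {}


def _register_invalidation(gm: torch.fx.GraphModule) -> None:
    gm_id = id(gm)

    def _evict(dead_id=gm_id):
        # Runs once the GraphModule has been garbage collected: drop any entry
        # whose key mentions its id, so a new object that happens to reuse the
        # same id cannot alias a stale cache entry.
        for key in [k for k in _fake_output_cache if dead_id in k]:
            del _fake_output_cache[key]

    # The callback captures only gm_id, and finalize holds gm weakly,
    # so registering invalidation does not extend the GraphModule's lifetime.
    weakref.finalize(gm, _evict)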

@zou3519 zou3519 (Contributor) left a comment:

I think we forgot to discuss the recursive case (we did talk about it half a year ago, and I am remembering it now): what happens if invoke_subgraph has a subgraph where there is an operator that is not eligible for FakeTensor caching?

We shouldn't allow that invoke_subgraph to be cached. There's probably an efficient strategy for checking this: during FakeTensorProp for invoke_subgraph, run FakeTensorProp on the subgraph first, and only if that didn't hit any ineligible operators do we say the invoke_subgraph can be cached.
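
A sketch of that recursive eligibility check, descending into nested subgraphs reached via get_attr nodes (an illustration of the strategy, not the PR's implementation):

import torch
from torch.fx import GraphModule


def _subgraph_is_cacheable(gm: GraphModule) -> bool:
    # Bottom-up check: an invoke_subgraph is cacheable only if every op it
    # (transitively) contains is itself eligible for FakeTensor caching.
    for node in gm.graph.nodes:
        if node.op == "get_attr":
            try:
                nested = gm.get_submodule(node.target)
            except AttributeError:
                continue  # a parameter/buffer, not a nested graph
            if isinstance(nested, GraphModule) and not _subgraph_is_cacheable(nested):
                return False
        elif node.op == "call_function":
            tags = getattr(node.target, "tags", ())
            if (
                torch.Tag.data_dependent_output in tags
                or torch.Tag.dynamic_output_shape in tags
            ):
                return False
    return True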

@anijain2305 anijain2305 requested a review from zou3519 March 20, 2025 21:13

@zou3519 zou3519 (Contributor) left a comment:

tests failing

@zou3519 zou3519 (Contributor) left a comment:

LGTM

@pytorchmergebot (Collaborator) commented:

Starting merge as part of PR stack under #150036

Divigroup-RAP pushed a commit to Divigroup-RAP/PYTORCH that referenced this pull request Apr 22, 2025
@github-actions github-actions bot deleted the gh/anijain2305/700/head branch May 2, 2025 02:16