Allow freezing of FunctionGraph for hashing by jessegrabowski · Pull Request #1908 · pymc-devs/pytensor

jessegrabowski · 2026-02-22T13:23:44Z

LLM disclosure: this PR made heavy use of Claude in the planning and first cut stages, though I was heavily involved. Still, the code should be subject to extra scrutiny as a result.

The purpose of the PR is to refactor Ops with inner graphs to allow comparison. The linked issue has an exhaustive discussion of the factors at play. There was an attempt in the Aesara days to attack this, but it was perhaps too aggressive: it cons-hashed all Apply nodes, which necessitated changes across the codebase. @ricardoV94 suggested a weakref dict approach for subgraphs. This is implemented at the Op level. The plan is for Ops that have inner graphs (Composite, ScalarLoop, Scan, OpFromGraph, etc.) to have a _cache class attribute and to implement the Op-specific logic for caching, pickling, unpickling, etc. It didn't look super generalizable to me at first blush, but we can argue about it maybe.

Changes to FunctionGraph:

FunctionGraph now has a method freeze that returns a FrozenFunctionGraph.
The FrozenFunctionGraph does cons-hashing of Apply nodes within its scope only
It generates a hash based on its inner graph
Two FrozenFunctionGraphs with the same inner graph will evaluate to equal, but their Apply nodes won't be references to the same objects (this is the "conservatism" of my approach)

Specific implementation details:

The structural_hash of a FrozenFunctionGraph is built from a list of 3-tuples: (name, type, inputs), plus the outputs. For constants, inputs is replaced with the hash of the input data.
Equality between FrozenFunctionGraphs is done by comparing hashes, then falling back to equal_computation if the hash misses.
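
As a rough illustration of the hashing scheme described above (not the PR's actual code), the sketch below builds a structural key for a toy graph from (name, type, inputs) entries plus the outputs. The classes `Var`, `Const`, `Node` and the function `structural_key` are hypothetical stand-ins. Graph inputs are identified by position, and identical entries are interned, so duplicated subexpressions collapse to a single entry.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based equality, like ordinary graph variables
class Var:
    type: str


@dataclass(eq=False)
class Const:
    type: str
    value: float


@dataclass(eq=False)
class Node:
    op: str
    type: str
    inputs: tuple


def structural_key(inputs, outputs):
    """Return a hashable key describing the graph from `inputs` to `outputs`."""
    position = {v: i for i, v in enumerate(inputs)}  # graph inputs by position
    entry_ids = {}  # (name, type, inputs) entry -> index, in insertion order
    seen = {}       # visited object -> index of its entry

    def visit(obj):
        if obj in seen:
            return seen[obj]
        if isinstance(obj, Var):
            entry = ("input", obj.type, position[obj])
        elif isinstance(obj, Const):
            # For constants, the inputs slot holds a hash of the data.
            entry = ("const", obj.type, hash(obj.value))
        else:
            entry = (obj.op, obj.type, tuple(visit(i) for i in obj.inputs))
        # Identical entries share one index, which de-duplicates subgraphs.
        idx = entry_ids.setdefault(entry, len(entry_ids))
        seen[obj] = idx
        return idx

    out_ids = tuple(visit(o) for o in outputs)
    return (tuple(entry_ids), out_ids)


# Two separately built graphs with the same structure: sin(x0) * sqr(x1)
a, b = Var("float64"), Var("float64")
c, d = Var("float64"), Var("float64")
g1 = Node("mul", "float64", (Node("sin", "float64", (a,)), Node("sqr", "float64", (b,))))
g2 = Node("mul", "float64", (Node("sin", "float64", (c,)), Node("sqr", "float64", (d,))))

key1 = structural_key([a, b], [g1])
key2 = structural_key([c, d], [g2])
swapped = structural_key([b, a], [g1])  # same expression, different input order
```

In this toy version `key1 == key2` while `swapped` differs, which is the property the hash comparison relies on. A fallback such as equal_computation would only be needed where the hash alone is not trusted.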

A consequence of the cons-hashing in this approach is that the inner graph is de-duplicated when we call fg.freeze(). So a MergeOptimizer pass is no longer required. Usage is demonstrated on the Composite Op. If we like the approach I can move forward with refactoring other Ops, but I wanted to stop here and discuss the approach.

Code example:

import pytensor.tensor as pt
import pytensor

a, b, c, d = pt.dscalars('a', 'b', 'c', 'd')
eq1 = pt.sin(a) * b ** 2
eq2 = pt.sin(c) * d ** 2

with pytensor.config.change_flags(optimizer_verbose=True):
    f = pytensor.function([a, b, c, d], [eq1, eq2])

f.dprint()

Result:

Composite{(sin(*0-<float64>) * sqr(*1-<float64>))} [id A] 1
 ├─ a [id B]
 └─ b [id C]
Composite{(sin(*0-<float64>) * sqr(*1-<float64>))} [id D] 0
 ├─ c [id E]
 └─ d [id F]

Inner graphs:

Composite{(sin(*0-<float64>) * sqr(*1-<float64>))} [id A]
 ← mul [id G]
    ├─ sin [id H]
    │  └─ *0-<float64> [id I]
    └─ sqr [id J]
       └─ *1-<float64> [id K]

Composite{(sin(*0-<float64>) * sqr(*1-<float64>))} [id D]
 ← mul [id G]
    └─ ···

ricardoV94

Why did you not go all out?

If you already deduplicate and do internal hash-cons you are one step away from getting hashing for free across different FunctionGraphs. Just do the hash-cons globally. Then FrozenFunctionGraph([x, y], [foo(x, y)]) is equal to another function graph if and only if fgraph.outputs == other_fgraph.outputs. No need for recursive hashing or expensive equal_computations.
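
To make the global hash-cons idea concrete, here is a minimal sketch of the general technique, assuming a toy `Node` class that is unrelated to PyTensor's real classes. Nodes are interned in a module-level `weakref.WeakValueDictionary`, so building the same expression twice returns the same object and comparing graphs reduces to comparing their outputs by identity. Leaf inputs are interned by a nominal label such as `"input0"`, which is an assumption of this sketch.

```python
import weakref


class Node:
    """A toy interned graph node; build instances with `Node.make`."""

    # Global table: (op, inputs) -> live node. Values are held weakly, so an
    # entry disappears once nothing else references the node.
    _table = weakref.WeakValueDictionary()

    def __init__(self, op, inputs):
        self.op = op
        self.inputs = inputs  # tuple of already-interned nodes

    @classmethod
    def make(cls, op, *inputs):
        key = (op, inputs)  # inputs are interned, so identity hashing suffices
        node = cls._table.get(key)
        if node is None:
            node = cls(op, inputs)
            cls._table[key] = node
        return node


def build():
    """Build sin(input0) * sqr(input1) from scratch."""
    x = Node.make("input0")
    y = Node.make("input1")
    return Node.make("mul", Node.make("sin", x), Node.make("sqr", y))


out1 = build()
out2 = build()
same_graph = out1 is out2  # equality check is a pointer comparison
```

The trade-off raised in the thread still applies: interning everything globally makes equality cheap, but it requires every node to be created through the interning constructor.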

As it stands you are not doing much better than sneaking a default MergeOptimizer in at __init__ and adding a FunctionGraph class that has no replace mode.

And cheap hashing/ equality is not just a nice to have, it's really valuable to not slow down compilation. In some of my benchmarks on previous work, some graphs could spend inordinate time on equality checks.

Comments regardless of whether we go:

Don't create FrozenFunctionGraph as a subclass of FrozenGraph. Let's push the general principle: shared abstract classes, no subclassing of actually realized objects. Then you don't need check_frozen; the methods just don't exist for the frozen subclass.
You could create a frozenApply that uses tuples for inputs/outputs instead of lists. That will help ensure immutability, because all our current rewrite machinery works on the idea of overriding entries in those lists. Accidentally trying to mutate a graph would fail there 99% of the time.
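
The two suggestions above can be sketched with toy classes. All names here (`ToyAbstractGraph`, `ToyMutableGraph`, `ToyFrozenGraph`, `ToyFrozenApply`) are hypothetical and only illustrate the pattern: the frozen class shares an abstract base with the mutable one instead of subclassing it, so mutating methods are simply absent, and tuple-valued inputs make in-place replacement raise.

```python
from abc import ABC, abstractmethod


class ToyAbstractGraph(ABC):
    """Read-only interface shared by the mutable and frozen toy graphs."""

    @abstractmethod
    def toposort(self):
        ...


class ToyMutableGraph(ToyAbstractGraph):
    def __init__(self, nodes):
        self.nodes = list(nodes)

    def toposort(self):
        return list(self.nodes)

    def replace(self, old, new):
        # Mutation lives only on the mutable class.
        self.nodes[self.nodes.index(old)] = new


class ToyFrozenGraph(ToyAbstractGraph):
    def __init__(self, nodes):
        self.nodes = tuple(nodes)

    def toposort(self):
        return self.nodes
    # No `replace` method: there is nothing to guard with a frozen check.


class ToyFrozenApply:
    """A toy apply node whose inputs are stored in a tuple."""

    def __init__(self, op, inputs):
        self.op = op
        self.inputs = tuple(inputs)


node = ToyFrozenApply("add", ["x", "y"])
try:
    node.inputs[0] = "z"  # the list-style overwrite that rewrites rely on
    mutated = True
except TypeError:
    mutated = False
```

Here `mutated` ends up `False`, because tuples do not support item assignment.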

ricardoV94

This is starting to look good, how are you feeling about it?

Notes:

Add a FrozenFunctionGraph.unfreeze(), that yields a FunctionGraph?
Really try to avoid the FrozenConstant stuff
Ops with inner graph (at least the ones you touched now) should only have a FrozenFunctionGraph internally (not a mutable one as well). Maybe that's already the case.

We need some follow-up issues open:

Optimizing OpFromGraph: There should be an explicit rewrite that creates a new OpFromGraph with its updated frozen graph (so it is also reflected immediately in dprint). We should never do any further rewrites of the internal fgraph during compilation.
Scan/Minimize/Root: Use the new FrozenFunctionGraph as well. This should immediately address #1601
When compiling OpFromGraph in jitted contexts we should try to avoid recreating inner numba/jax functions when the same OFG is compiled multiple times in a function; this will likely speed up compilation. In the C-backend that already happens due to the caching of _fn. That's how we can deliver on the promised compilation speedups, and it's especially relevant for a library like pytensor-ml that may want to chain hundreds of the same "LayerOp"s in sequence
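
The last point depends on inner graphs being hashable. A minimal sketch of the idea, assuming a hypothetical `compile_inner` step and a plain dict cache (the real numba/jax dispatch machinery is not shown and may work differently):

```python
compile_calls = []  # records every real compilation, for illustration
_compiled = {}      # hashable graph key -> compiled function


def compile_inner(graph_key):
    """Stand-in for an expensive backend compilation of an inner graph."""
    compile_calls.append(graph_key)
    _, factor = graph_key
    return lambda x: x * factor


def get_compiled(graph_key):
    """Reuse the compiled function when an equal graph was seen before."""
    fn = _compiled.get(graph_key)
    if fn is None:
        fn = _compiled[graph_key] = compile_inner(graph_key)
    return fn


# One hundred layers that each carry an equal (but separately built) key.
layer_keys = [tuple(["scale", 2.0]) for _ in range(100)]
layer_fns = [get_compiled(key) for key in layer_keys]

value = 1.0
for fn in layer_fns[:3]:
    value = fn(value)
```

With equal keys hashing equally, `compile_inner` runs once for all one hundred layers, which is where the compilation savings would come from.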

ricardoV94 · 2026-03-08T21:26:07Z

I left some comments as I checked the changes. I need to think/discuss a bit about the spec thing, and the desire to have a consistent hashing across runtimes. If you remove that the complexity of this PR drops quite a bit, but maybe this is also fine.

Can you confirm this was only needed for the C-backend, and that it would also work if whatever relies on that called something like __stable_hash__ instead of __hash__, that does the fingerprint / spec thing?
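
For context on why a separate method might be wanted: Python's built-in `hash()` of strings is salted per process unless `PYTHONHASHSEED` is fixed, so it is not suitable as a key that must survive across interpreter runs. Below is a minimal sketch of a run-independent fingerprint, assuming the graph can be summarized as a nested tuple of strings and numbers. The function name `stable_fingerprint` is hypothetical, and `__stable_hash__` is only the name floated in the comment above, not an existing protocol.

```python
import hashlib


def stable_fingerprint(spec):
    """Digest a nested tuple of str/int/float identically in every process."""
    # repr() of such tuples is deterministic, unlike hash() of str objects.
    return hashlib.sha256(repr(spec).encode("utf-8")).hexdigest()


spec_a = (("sin", "float64", (0,)), ("sqr", "float64", (1,)), ("mul", "float64", (0, 1)))
spec_b = (("sin", "float64", (0,)), ("sqr", "float64", (1,)), ("mul", "float64", (0, 1)))
spec_c = (("cos", "float64", (0,)),)

fp_a = stable_fingerprint(spec_a)
fp_b = stable_fingerprint(spec_b)
fp_c = stable_fingerprint(spec_c)
```

Keeping this apart from `__hash__` would let in-process equality stay cheap while only the code that needs a persistent key pays for the digest.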

Besides that, this PR looks amazing, and it's a game changer for working with inner graph ops. We really need those to work well

jessegrabowski · 2026-03-09T00:15:33Z

I removed the spec stuff and simplified the PR down somewhat.

ricardoV94 · 2026-04-06T20:16:05Z

+        for i, out in enumerate(frozen_outputs):
+            out.name = f"o{i}"


I think this is wrong? The same variable could be output 0 in one graph and output 2 in another? Or are these the dummy Output Ops we put in clients?

ricardoV94 · 2026-04-07T09:20:45Z

+        self.variables: frozenset[Variable] = frozenset(memo.values())
+        self.apply_nodes: frozenset[Apply] = frozenset(sorted_apply_nodes)
+        self._clients: dict[Variable, list[ClientType]] | None = None
+        self._toposort: tuple[Apply, ...] = tuple(sorted_apply_nodes)


I pre-computed these (except for clients), because we basically have everything we needed already from our loop.

I made them frozenset/tuple instead.

ricardoV94 · 2026-04-07T09:23:02Z

+    @property
+    def clients(self) -> dict[Variable, list[ClientType]]:  # type: ignore[override]
+        if self._clients is None:
+            clients: dict[Variable, list[ClientType]] = {v: [] for v in self.variables}


Got rid of the setdefault in the inner loop, which speeds things up a bit. We may end up with more clients than before, for variables without nodes. I think this is much more robust.
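
The change described here can be sketched with plain dicts and strings standing in for variables and Apply nodes (the function `build_clients` and the data layout are assumptions, not the PR's types). Every variable gets an empty list up front, so the inner loop is a bare append and variables with no consumers still appear in the mapping.

```python
def build_clients(variables, apply_nodes):
    """Map each variable to the (node name, input index) pairs that use it."""
    clients = {v: [] for v in variables}  # pre-populate instead of setdefault
    for node in apply_nodes:
        for i, inp in enumerate(node["inputs"]):
            clients[inp].append((node["name"], i))
    return clients


variables = ["x", "y", "out"]
apply_nodes = [{"name": "add", "inputs": ["x", "y"]}]
clients = build_clients(variables, apply_nodes)
# "out" is consumed by no node, yet it still has an (empty) entry.
```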

One big difference though between this and the regular FunctionGraph is we don't have the dummy Output Apply in the clients of output vars. I think we should add

jessegrabowski requested a review from ricardoV94 February 22, 2026 13:23

jessegrabowski added the enhancement, request discussion, graph objects, and OpFromGraph labels Feb 22, 2026

ricardoV94 reviewed Feb 23, 2026

Comment thread pytensor/scalar/basic.py Outdated

jessegrabowski force-pushed the hashable-inner-graphs branch from 305e26e to 08609fe Compare February 25, 2026 10:37

ricardoV94 reviewed Feb 25, 2026

Comment thread pytensor/graph/fg.py Outdated

jessegrabowski force-pushed the hashable-inner-graphs branch 2 times, most recently from 78ee1a9 to eda51d2 Compare March 8, 2026 19:18

ricardoV94 reviewed Mar 8, 2026

Comment thread pytensor/compile/builders.py

ricardoV94 reviewed Mar 8, 2026

Comment thread pytensor/graph/fg.py Outdated

ricardoV94 reviewed Mar 8, 2026

Comment thread pytensor/graph/fg.py Outdated

ricardoV94 reviewed Mar 8, 2026

Comment thread pytensor/graph/fg.py Outdated

ricardoV94 reviewed Mar 8, 2026

Comment thread pytensor/graph/fg.py Outdated

ricardoV94 reviewed Mar 8, 2026

Comment thread pytensor/graph/fg.py Outdated

ricardoV94 reviewed Mar 8, 2026

Comment thread pytensor/tensor/rewriting/elemwise.py Outdated

jessegrabowski force-pushed the hashable-inner-graphs branch from eda51d2 to 7202ca3 Compare March 9, 2026 00:15

jessegrabowski force-pushed the hashable-inner-graphs branch 3 times, most recently from 4a7bea8 to 445731f Compare March 9, 2026 00:48

ricardoV94 reviewed Mar 9, 2026

Comment thread pytensor/graph/basic.py

ricardoV94 reviewed Mar 9, 2026

Comment thread pytensor/graph/fg.py Outdated

ricardoV94 reviewed Mar 9, 2026

Comment thread pytensor/graph/fg.py Outdated

ricardoV94 reviewed Mar 9, 2026

Comment thread pytensor/graph/fg.py Outdated

ricardoV94 reviewed Mar 9, 2026

Comment thread pytensor/graph/op.py Outdated

ricardoV94 reviewed Mar 9, 2026

Comment thread tests/compile/test_builders.py

ricardoV94 reviewed Apr 6, 2026

ricardoV94 force-pushed the hashable-inner-graphs branch 5 times, most recently from a9ac55f to ae73a91 Compare April 7, 2026 09:18

ricardoV94 reviewed Apr 7, 2026

jessegrabowski and others added 16 commits April 7, 2026 19:48

Add FrozenApply

a86b48e

Add AbstractFunctionGraph and FrozenFunctionGraph

d5ff870

Dispatch AbstractFunctionGraph in linkers

87a3236

Update Composite to use FrozenFunctionGraph

ac5a19e

Update OpFromGraph to use FrozenFunctionGraph

1fbaff4

Update ScalarLoop to use FrozenFunctionGraph

a0845be

Update Scan to use FrozenFunctionGraph

16796f1

Check outputs for equality before inputs in FrozenFunctionGraph.__eq__

84106e0

Use props-based equality in ScalarInnerGraphOp

3f4c8de

Give NominalVariables pretty names and update dprint tests

04b305c

Avoid storing strong reference to objects in cache keys

ddbdcc6

Do not rename outputs and pre-compute variables we already iterated across

ce74a09

Simplify mostly redundant check and add comment

446fd7f

Inputs may change after FrozenApply is created

b92a9be

Don't try to override interned inputs in the memo

6ad54be

mypy :)

8a77a26

jessegrabowski force-pushed the hashable-inner-graphs branch from bafdb84 to 8a77a26 Compare April 8, 2026 01:00

jessegrabowski and others added 2 commits April 7, 2026 20:04

Update print test

173938e

Re-order vars

7e09e0b

ricardoV94 merged commit e2f36d1 into pymc-devs:v3 Apr 8, 2026
64 checks passed

This was referenced Apr 8, 2026

Follow-up improvements to FrozenFunctionGraph infrastructure #2033

Open

Implement OpFromGraph __eq__ and __hash__ #1114

Closed
