fix: Canonicalize more by cqc-alec · Pull Request #2839 · Quantinuum/hugr

cqc-alec · 2026-01-27T14:26:00Z

This allows us to use the binary format instead of the deprecated JSON format in the round-trip equality testing used for HUGR validation.

I've marked one test (simple_alias) as #[ignore]. Aliases are not properly supported by the model, are not currently used as far as I know, and will probably be removed or at least reworked soon. (See #2828 .)

Some of the benchmark results are a little alarming. It looks like the regressions are mainly in a sorting method being called during JSON serialization.

Closes #2789 .

codecov · 2026-01-27T14:30:14Z

Codecov Report

❌ Patch coverage is 95.55556% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.80%. Comparing base (46db58a) to head (aa373f6).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
hugr-core/src/ops.rs	83.78%	6 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2839      +/-   ##
==========================================
+ Coverage   83.69%   83.80%   +0.11%     
==========================================
  Files         267      267              
  Lines       53472    53596     +124     
  Branches    47598    47722     +124     
==========================================
+ Hits        44754    44917     +163     
+ Misses       6310     6272      -38     
+ Partials     2408     2407       -1

Flag	Coverage Δ
python	`88.47% <ø> (ø)`
rust	`83.23% <95.55%> (+0.12%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codspeed-hq · 2026-01-27T14:32:03Z

Merging this PR will not alter performance

✅ 28 untouched benchmarks

_{Comparing ae/roundtrip.1 (aa373f6) with main (46db58a)}

aborgna-q · 2026-01-28T14:49:04Z

❌ circuit_serialize/json[1000] 29.4 ms -> 1,108.4 ms

That's spending 1.1s doing quicksort???

cqc-alec · 2026-01-28T15:19:10Z

❌ circuit_serialize/json[1000] 29.4 ms -> 1,108.4 ms

That's spending 1.1s doing quicksort???

Terrifying. I don't understand why it should have this effect. Is it because I impled PartialOrd for OpType?

aborgna-q · 2026-01-28T15:34:28Z

hugr-core/src/ops.rs

+        } else if a > b {
+            Some(Ordering::Greater)
+        } else {
+            match format!("{:?}", self).cmp(&format!("{:?}", other)) {


Looks like most of the time is spend in this format! call, but not sure how it ended up being called this much. (String formatting is expensive).

A quick log(n) speedup would be using sorted_by_key to avoid recomputing everything on each step instead of implementing PartialOrd.

The sorting should happen once per finish_hugr() call, because that does validation which involves the round-trip comparison, which would be expensive if the hugr contains a large number of indistinguishable ops in the same region, so I thought if I did this (skipping validation) it would solve the problem, and it did seem to make a difference when I ran cargo bench locally, but the results on the PR didn't change much.

if it's doing this sort in the roundtrip that happens during validation (only) when HUGR_TEST_SCHEMA is set, that shouldn't affect benchmarks, should it?

Yeah, my initial diagnosis was wrong. The benchmarks were bad because it was doing it during JSON serialization.

This reverts commit 08f283e.

This reverts commit ec58b23.

acl-cqc · 2026-01-30T12:27:44Z

hugr-core/src/hugr.rs

+            let const_parent = self.get_parent(node).unwrap();
+            let outport = self.node_outputs(node).exactly_one().ok().unwrap();
+            let succ_nodeports = self.all_linked_inputs(node).collect_vec();
+            for (succ_node, succ_inport) in succ_nodeports {


How about putting the Const at the LCA of all successors (or removing if there are no successors, OK)?

I did it this way because this is what the model loader does already (inserts a Const node in the same region as the LoadConst). LCA could work too, but would be a bit fiddlier.

acl-cqc · 2026-01-30T12:29:00Z

hugr-core/src/hugr.rs

+    }
+
+    /// Whether the node's children form either a dataflow sibling graph or a
+    /// control-flow sibling graph. (For these nodes the positions of the first


What other nodetype might a non-module-root ancestor of a LoadConstant be? (Sorry if there's an obvious answer!)

Ok, sorry, this is for sorting below, not for placing Const nodes above.

"Is not module_root" or "OpType is not module" might be all you need though

Yes true. But I think I prefer to have the test this way round, with an explicit function that says whether the first two children have semantic significance. (Maybe secretly I hope this won't be necessary one day, as it would be simpler to think of children as always unordered.)

acl-cqc · 2026-01-30T12:29:41Z

hugr-core/src/hugr.rs

+            } else {
+                children.sort_by(sort_fn);
+            }
+            for child in children {


queue.extend(children) probably works here

acl-cqc

Good stuff Alec, thanks. Somewhat painful I admit, but then, we are trying to avoid a graph-isomorphism problem after all, which would be very painful indeed....!!

cqc-alec · 2026-02-09T14:43:21Z

I finally twigged that canonical_order() is called not only by the verification code but also by the JSON serialization code. That explains why all the JSON serialization benchmarks got a lot slower. (Confirmed by looking at flame graph of bench executable.) Since we are planning to deprecate JSON (that is the motivation for this in the first place), I'd argue we can accept this performance hit. @acl-cqc would you agree?

acl-cqc

Hmmm, still not quite sure we've got to the bottom of this yet, but a few thoughts

acl-cqc · 2026-02-12T15:12:03Z

hugr-core/src/hugr/validate.rs

-            // TODO: Currently fails when using `hugr-model`
-            //crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::binary());
-            crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::text());
+            crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::binary());


any reason not to check EnvelopeConfig::text as well?

No reason except that the JSON format will be deprecated soon.

Maybe add a TODO to add ModelText once that's stable?

acl-cqc · 2026-02-12T15:14:47Z

hugr-core/src/hugr.rs

+                }
+            };
+            if self.contains_dsg_or_csg(node) {
+                children[2..].sort_by(sort_fn);


acl-cqc · 2026-02-12T15:19:07Z

hugr-core/src/hugr.rs

-                queue.push_back(child);
+            let mut children = self.children(node).collect_vec();
+            let sort_fn = |a: &Node, b: &Node| {
+                let n_a_inp = self.input_neighbours(*a).count();


I am not sure this ordering by number of predecessors/successors adds very much, but no objection.

If we wanted to be more canonical...an algorithm might be to do a topological sort, with our own choice among the legal candidates at each step - our choice being the minimum by a lexicographic ordering of each input's (node-number of source node, output index), followed by optype. (The topsort ensures the source node has been given a number before any consumer of its outputs.)

I think that's more work though so probably not worth it.

acl-cqc · 2026-02-12T15:22:50Z

hugr-core/src/hugr.rs

+        let mut node_children: HashMap<Node, Vec<Node>> = HashMap::default();
+        for node in self.nodes() {
+            let mut children = self.children(node).collect_vec();
+            if self.contains_dsg_or_csg(node) {


This is similar to the order used in the closure above, but without the comparison of number of inputs / number of outputs. Are they intended to be different? If you want the #inputs/outputs comparison in both then it's probably worth commoning that up?

And also without the comparison of optypes for the case where the numbers of inputs and outputs are equal. This is more of a lightweight final step. I agree there is a little bit of boilerplate though.

acl-cqc · 2026-02-12T15:23:54Z

hugr-core/src/hugr.rs

        // This step is not strictly necessary.
        self.graph.compact_nodes(|_, _| {});
+
+        self.order_siblings_by_node_index();


It does seem odd to call this here having already used canonical_order earlier in fn canonicalize_nodes?

I'm now struggling to understand why this was necessary! May have to come back to it... It certainly is necessary in the sense that tests fail without it...

Coming back to this: this is to handle the case where, in the initial hugr, we have siblings A (say a LoadConstant) and B (say the preceding Const) where A happens to comes before B in the list of children, and A has a lower node index, but B comes before A in the canonical ordering based on OpType.

The first ordering will then relabel the node indices, so that B has a lower node index than A. But it won't change the ordering in the list of children. To canonicalize this we have to reorder this list by node index. This way we get the same end result versus starting with a HUGR where the initial node indices of A and B were swapped and their positions in the list of children were also swapped.

acl-cqc · 2026-02-12T15:25:36Z

hugr-core/src/ops.rs

+        } else if a > b {
+            Some(Ordering::Greater)
+        } else {
+            match format!("{:?}", self).cmp(&format!("{:?}", other)) {


if it's doing this sort in the roundtrip that happens during validation (only) when HUGR_TEST_SCHEMA is set, that shouldn't affect benchmarks, should it?

cqc-alec · 2026-02-13T13:43:19Z

I did the "icky" thing, which I actually think is not so icky, of reverting to the previous lightweight sorting when serializing to JSON. Now the benchmarks are good.

aborgna-q · 2026-02-13T15:11:09Z

hugr-core/src/hugr/validate.rs

-            // TODO: Currently fails when using `hugr-model`
-            //crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::binary());
-            crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::text());
+            crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::binary());


Maybe add a TODO to add ModelText once that's stable?

aborgna-q · 2026-02-13T15:18:03Z

hugr-core/src/hugr.rs

+                    continue;
+                }
+                self.disconnect_edge(node, outport, succ_node, succ_inport);
+                let new_node = self.add_node_before(succ_node, self.get_optype(node).clone());


nit: If multiple nodes in the same region use the constant we'll still generate multiple const nodes here.

Maybe we could collect a BTreeMap<succ_parent: Node, succ_nodes: Vec<(Node, IncomingPort)>> first?

Not sure if it's worth the extra complexity.

Yes this is true, but multiple Const nodes is exactly what we get with model deserialization anyway. Probably worth making a pass to consolidate these -- but I think too complex to do here.

aborgna-q · 2026-02-13T15:24:49Z

hugr-core/src/hugr.rs

+        matches!(
+            self.get_optype(node),
+            OpType::FuncDefn(_)
+                | OpType::DFG(_)
+                | OpType::DataflowBlock(_)
+                | OpType::TailLoop(_)
+                | OpType::CFG(_)
+                | OpType::Case(_)
+        )


This could be done using tags, if we want to avoid enumerating all cases.

let tag = self.get_optype(node).tag(); tag <= OpTag::DataflowParent || tag <= OpTag::Cfg

Nice, thanks.

aborgna-q · 2026-02-13T15:28:48Z

hugr-core/src/hugr.rs

    ///
    /// Used by [`HugrMut::canonicalize_nodes`] and the serialization code.
-    fn canonical_order(&self, root: Node) -> impl Iterator<Item = Node> + '_ {
+    fn canonical_order(&self, root: Node, try_hard: bool) -> impl Iterator<Item = Node> + '_ {


It'd be cleaner to define an enum rather than use opaque bool parameters, something like

/// ... #[derive(Clone, Copy, Debug)] enum CanonicalOrderStrat { /// ... TryHard, /// ... Quick, }

Good suggestion, will do.

## 🤖 New release * `hugr-model`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-core`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-llvm`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-passes`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-cli`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-persistent`: 0.4.5 -> 0.4.6 <details><summary>Changelog</summary> ## `hugr-model` <blockquote> ## [0.25.6](hugr-model-v0.25.5...hugr-model-v0.25.6) - 2026-02-20 ### New Features - Remove size limitation for binary envelopes ([#2880](#2880)) </blockquote> ## `hugr-core` <blockquote> ## [0.25.6](hugr-core-v0.25.5...hugr-core-v0.25.6) - 2026-02-20 ### Bug Fixes - Canonicalize more ([#2839](#2839)) - used_extensions should include transitive requirements ([#2891](#2891)) ### New Features - Add s expression format to envelope formats ([#2864](#2864)) - added hash.rs, updated imports ([#2840](#2840)) - *(hugr-py)* Define typed Metadata protocol ([#2765](#2765)) - Add a `NodeTemplate::call_to_function` helper ([#2878](#2878)) - Remove size limitation for binary envelopes ([#2880](#2880)) </blockquote> ## `hugr-llvm` <blockquote> ## [0.25.6](hugr-llvm-v0.25.5...hugr-llvm-v0.25.6) - 2026-02-20 ### New Features - Add error context when lowering hugrs to LLVM ([#2869](#2869)) </blockquote> ## `hugr-passes` <blockquote> ## [0.25.6](hugr-passes-v0.25.5...hugr-passes-v0.25.6) - 2026-02-20 ### Bug Fixes - Panic on UntuplePass when nodes had order edges ([#2883](#2883)) ### New Features - added hash.rs, updated imports ([#2840](#2840)) - Add a `NodeTemplate::call_to_function` helper ([#2878](#2878)) </blockquote> ## `hugr` <blockquote> ## [0.25.6](hugr-v0.25.5...hugr-v0.25.6) - 2026-02-20 ### Bug Fixes - Panic on UntuplePass when nodes had order edges ([#2883](#2883)) - Canonicalize more ([#2839](#2839)) - used_extensions should include transitive requirements ([#2891](#2891)) ### New Features - Add s expression format to envelope formats ([#2864](#2864)) - *(hugr-py)* Define typed Metadata protocol ([#2765](#2765)) - Add a `NodeTemplate::call_to_function` helper ([#2878](#2878)) - added hash.rs, updated imports ([#2840](#2840)) - Remove size limitation for binary envelopes ([#2880](#2880)) </blockquote> ## `hugr-cli` <blockquote> ## [0.25.6](hugr-cli-v0.25.5...hugr-cli-v0.25.6) - 2026-02-20 ### New Features - Add s expression format to envelope formats ([#2864](#2864)) </blockquote> ## `hugr-persistent` <blockquote> ## [0.4.0](hugr-persistent-v0.3.4...hugr-persistent-v0.4.0) - 2025-12-22 ### New Features - [**breaking**] Remove `RootCheckable` ([#2704](#2704)) - [**breaking**] Bump MSRV to Rust 1.89 ([#2747](#2747)) - [**breaking**] Type-safe access for node metadata ([#2755](#2755)) ### Refactor - [**breaking**] Remove multiple deprecated definitions ([#2751](#2751)) </blockquote> </details> --- This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).

cqc-alec added 6 commits January 27, 2026 11:40

Try using EnvelopeConfig::binary().

0a8aa51

Implement a partial ordering of sibling nodes.

23c699d

Sort output neighbours when comparing.

b66ef4f

Fix test so that it makes a valid HUGR.

b0dbb34

Canonicalize Const nodes.

2b56684

Ignore test with Alias.

de275ad

cqc-alec changed the title ~~Canonicalize more~~ fix: Canonicalize more Jan 27, 2026

cqc-alec marked this pull request as ready for review January 27, 2026 14:48

cqc-alec requested a review from a team as a code owner January 27, 2026 14:48

cqc-alec requested a review from acl-cqc January 27, 2026 14:48

aborgna-q reviewed Jan 28, 2026

View reviewed changes

cqc-alec added 5 commits January 29, 2026 14:12

Merge branch 'main' into ae/roundtrip.1

815d986

Define partial order as a function rather than a trait.

08f283e

Revert "Define partial order as a function rather than a trait."

88e8129

This reverts commit 08f283e.

Skip validation when constructing example Hugrs for benchmarks.

ec58b23

Revert "Skip validation when constructing example Hugrs for benchmarks."

a3ae97a

This reverts commit ec58b23.

acl-cqc reviewed Jan 30, 2026

View reviewed changes

cqc-alec added 4 commits January 30, 2026 13:40

Merge branch 'main' into ae/roundtrip.1

b58730f

Simplify.

639ab00

Merge branch 'main' into ae/roundtrip.1

067ee81

Merge branch 'main' into ae/roundtrip.1

f0d3cc8

cqc-alec requested a review from acl-cqc February 9, 2026 14:43

acl-cqc reviewed Feb 12, 2026

View reviewed changes

cqc-alec added 2 commits February 13, 2026 11:59

Merge branch 'main' into ae/roundtrip.1

87514f3

Revert to lightweight sorting for JSON serialization.

028fe51

cqc-alec requested review from aborgna-q and acl-cqc February 13, 2026 13:43

aborgna-q reviewed Feb 13, 2026

View reviewed changes

cqc-alec added 3 commits February 13, 2026 15:36

Add TODO comment.

6db9ed3

Simplify.

cd0c5e6

Use an enum instead of a bool.

aa373f6

cqc-alec requested a review from aborgna-q February 13, 2026 16:05

aborgna-q approved these changes Feb 13, 2026

View reviewed changes

cqc-alec added this pull request to the merge queue Feb 16, 2026

Merged via the queue into main with commit 82c25ef Feb 16, 2026
30 checks passed

cqc-alec deleted the ae/roundtrip.1 branch February 16, 2026 08:58

hugrbot mentioned this pull request Feb 13, 2026

chore: release #2852

Merged

hugrbot mentioned this pull request Feb 27, 2026

chore: release #2909

Merged

Conversation

cqc-alec commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

codspeed-hq bot commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Uh oh!

aborgna-q commented Jan 28, 2026

Uh oh!

cqc-alec commented Jan 28, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

acl-cqc left a comment

Choose a reason for hiding this comment

Uh oh!

cqc-alec commented Feb 9, 2026

Uh oh!

acl-cqc left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cqc-alec commented Feb 13, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cqc-alec commented Jan 27, 2026 •

edited

Loading

codecov bot commented Jan 27, 2026 •

edited

Loading

codspeed-hq bot commented Jan 27, 2026 •

edited

Loading