Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2839 +/- ##
==========================================
+ Coverage 83.69% 83.80% +0.11%
==========================================
Files 267 267
Lines 53472 53596 +124
Branches 47598 47722 +124
==========================================
+ Hits 44754 44917 +163
+ Misses 6310 6272 -38
+ Partials 2408 2407 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
That's spending 1.1s doing quicksort??? |
Terrifying. I don't understand why it should have this effect. Is it because I |
| } else if a > b { | ||
| Some(Ordering::Greater) | ||
| } else { | ||
| match format!("{:?}", self).cmp(&format!("{:?}", other)) { |
There was a problem hiding this comment.
Looks like most of the time is spend in this format! call, but not sure how it ended up being called this much. (String formatting is expensive).
A quick log(n) speedup would be using sorted_by_key to avoid recomputing everything on each step instead of implementing PartialOrd.
There was a problem hiding this comment.
The sorting should happen once per finish_hugr() call, because that does validation which involves the round-trip comparison, which would be expensive if the hugr contains a large number of indistinguishable ops in the same region, so I thought if I did this (skipping validation) it would solve the problem, and it did seem to make a difference when I ran cargo bench locally, but the results on the PR didn't change much.
There was a problem hiding this comment.
if it's doing this sort in the roundtrip that happens during validation (only) when HUGR_TEST_SCHEMA is set, that shouldn't affect benchmarks, should it?
There was a problem hiding this comment.
Yeah, my initial diagnosis was wrong. The benchmarks were bad because it was doing it during JSON serialization.
| let const_parent = self.get_parent(node).unwrap(); | ||
| let outport = self.node_outputs(node).exactly_one().ok().unwrap(); | ||
| let succ_nodeports = self.all_linked_inputs(node).collect_vec(); | ||
| for (succ_node, succ_inport) in succ_nodeports { |
There was a problem hiding this comment.
How about putting the Const at the LCA of all successors (or removing if there are no successors, OK)?
There was a problem hiding this comment.
I did it this way because this is what the model loader does already (inserts a Const node in the same region as the LoadConst). LCA could work too, but would be a bit fiddlier.
| } | ||
|
|
||
| /// Whether the node's children form either a dataflow sibling graph or a | ||
| /// control-flow sibling graph. (For these nodes the positions of the first |
There was a problem hiding this comment.
What other nodetype might a non-module-root ancestor of a LoadConstant be? (Sorry if there's an obvious answer!)
There was a problem hiding this comment.
Ok, sorry, this is for sorting below, not for placing Const nodes above.
"Is not module_root" or "OpType is not module" might be all you need though
There was a problem hiding this comment.
Yes true. But I think I prefer to have the test this way round, with an explicit function that says whether the first two children have semantic significance. (Maybe secretly I hope this won't be necessary one day, as it would be simpler to think of children as always unordered.)
hugr-core/src/hugr.rs
Outdated
| } else { | ||
| children.sort_by(sort_fn); | ||
| } | ||
| for child in children { |
There was a problem hiding this comment.
queue.extend(children) probably works here
acl-cqc
left a comment
There was a problem hiding this comment.
Good stuff Alec, thanks. Somewhat painful I admit, but then, we are trying to avoid a graph-isomorphism problem after all, which would be very painful indeed....!!
|
I finally twigged that |
acl-cqc
left a comment
There was a problem hiding this comment.
Hmmm, still not quite sure we've got to the bottom of this yet, but a few thoughts
| // TODO: Currently fails when using `hugr-model` | ||
| //crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::binary()); | ||
| crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::text()); | ||
| crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::binary()); |
There was a problem hiding this comment.
any reason not to check EnvelopeConfig::text as well?
There was a problem hiding this comment.
No reason except that the JSON format will be deprecated soon.
There was a problem hiding this comment.
Maybe add a TODO to add ModelText once that's stable?
hugr-core/src/hugr.rs
Outdated
| } | ||
| }; | ||
| if self.contains_dsg_or_csg(node) { | ||
| children[2..].sort_by(sort_fn); |
hugr-core/src/hugr.rs
Outdated
| queue.push_back(child); | ||
| let mut children = self.children(node).collect_vec(); | ||
| let sort_fn = |a: &Node, b: &Node| { | ||
| let n_a_inp = self.input_neighbours(*a).count(); |
There was a problem hiding this comment.
I am not sure this ordering by number of predecessors/successors adds very much, but no objection.
If we wanted to be more canonical...an algorithm might be to do a topological sort, with our own choice among the legal candidates at each step - our choice being the minimum by a lexicographic ordering of each input's (node-number of source node, output index), followed by optype. (The topsort ensures the source node has been given a number before any consumer of its outputs.)
I think that's more work though so probably not worth it.
| let mut node_children: HashMap<Node, Vec<Node>> = HashMap::default(); | ||
| for node in self.nodes() { | ||
| let mut children = self.children(node).collect_vec(); | ||
| if self.contains_dsg_or_csg(node) { |
There was a problem hiding this comment.
This is similar to the order used in the closure above, but without the comparison of number of inputs / number of outputs. Are they intended to be different? If you want the #inputs/outputs comparison in both then it's probably worth commoning that up?
There was a problem hiding this comment.
And also without the comparison of optypes for the case where the numbers of inputs and outputs are equal. This is more of a lightweight final step. I agree there is a little bit of boilerplate though.
| // This step is not strictly necessary. | ||
| self.graph.compact_nodes(|_, _| {}); | ||
|
|
||
| self.order_siblings_by_node_index(); |
There was a problem hiding this comment.
It does seem odd to call this here having already used canonical_order earlier in fn canonicalize_nodes?
There was a problem hiding this comment.
I'm now struggling to understand why this was necessary! May have to come back to it... It certainly is necessary in the sense that tests fail without it...
There was a problem hiding this comment.
Coming back to this: this is to handle the case where, in the initial hugr, we have siblings A (say a LoadConstant) and B (say the preceding Const) where A happens to comes before B in the list of children, and A has a lower node index, but B comes before A in the canonical ordering based on OpType.
The first ordering will then relabel the node indices, so that B has a lower node index than A. But it won't change the ordering in the list of children. To canonicalize this we have to reorder this list by node index. This way we get the same end result versus starting with a HUGR where the initial node indices of A and B were swapped and their positions in the list of children were also swapped.
| } else if a > b { | ||
| Some(Ordering::Greater) | ||
| } else { | ||
| match format!("{:?}", self).cmp(&format!("{:?}", other)) { |
There was a problem hiding this comment.
if it's doing this sort in the roundtrip that happens during validation (only) when HUGR_TEST_SCHEMA is set, that shouldn't affect benchmarks, should it?
|
I did the "icky" thing, which I actually think is not so icky, of reverting to the previous lightweight sorting when serializing to JSON. Now the benchmarks are good. |
| // TODO: Currently fails when using `hugr-model` | ||
| //crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::binary()); | ||
| crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::text()); | ||
| crate::envelope::test::check_hugr_roundtrip(&hugr, EnvelopeConfig::binary()); |
There was a problem hiding this comment.
Maybe add a TODO to add ModelText once that's stable?
| continue; | ||
| } | ||
| self.disconnect_edge(node, outport, succ_node, succ_inport); | ||
| let new_node = self.add_node_before(succ_node, self.get_optype(node).clone()); |
There was a problem hiding this comment.
nit: If multiple nodes in the same region use the constant we'll still generate multiple const nodes here.
Maybe we could collect a BTreeMap<succ_parent: Node, succ_nodes: Vec<(Node, IncomingPort)>> first?
Not sure if it's worth the extra complexity.
There was a problem hiding this comment.
Yes this is true, but multiple Const nodes is exactly what we get with model deserialization anyway. Probably worth making a pass to consolidate these -- but I think too complex to do here.
hugr-core/src/hugr.rs
Outdated
| matches!( | ||
| self.get_optype(node), | ||
| OpType::FuncDefn(_) | ||
| | OpType::DFG(_) | ||
| | OpType::DataflowBlock(_) | ||
| | OpType::TailLoop(_) | ||
| | OpType::CFG(_) | ||
| | OpType::Case(_) | ||
| ) |
There was a problem hiding this comment.
This could be done using tags, if we want to avoid enumerating all cases.
let tag = self.get_optype(node).tag();
tag <= OpTag::DataflowParent || tag <= OpTag::Cfg
hugr-core/src/hugr.rs
Outdated
| /// | ||
| /// Used by [`HugrMut::canonicalize_nodes`] and the serialization code. | ||
| fn canonical_order(&self, root: Node) -> impl Iterator<Item = Node> + '_ { | ||
| fn canonical_order(&self, root: Node, try_hard: bool) -> impl Iterator<Item = Node> + '_ { |
There was a problem hiding this comment.
It'd be cleaner to define an enum rather than use opaque bool parameters, something like
/// ...
#[derive(Clone, Copy, Debug)]
enum CanonicalOrderStrat {
/// ...
TryHard,
/// ...
Quick,
}There was a problem hiding this comment.
Good suggestion, will do.
## 🤖 New release * `hugr-model`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-core`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-llvm`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-passes`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-cli`: 0.25.5 -> 0.25.6 (✓ API compatible changes) * `hugr-persistent`: 0.4.5 -> 0.4.6 <details><summary><i><b>Changelog</b></i></summary><p> ## `hugr-model` <blockquote> ## [0.25.6](hugr-model-v0.25.5...hugr-model-v0.25.6) - 2026-02-20 ### New Features - Remove size limitation for binary envelopes ([#2880](#2880)) </blockquote> ## `hugr-core` <blockquote> ## [0.25.6](hugr-core-v0.25.5...hugr-core-v0.25.6) - 2026-02-20 ### Bug Fixes - Canonicalize more ([#2839](#2839)) - used_extensions should include transitive requirements ([#2891](#2891)) ### New Features - Add s expression format to envelope formats ([#2864](#2864)) - added hash.rs, updated imports ([#2840](#2840)) - *(hugr-py)* Define typed Metadata protocol ([#2765](#2765)) - Add a `NodeTemplate::call_to_function` helper ([#2878](#2878)) - Remove size limitation for binary envelopes ([#2880](#2880)) </blockquote> ## `hugr-llvm` <blockquote> ## [0.25.6](hugr-llvm-v0.25.5...hugr-llvm-v0.25.6) - 2026-02-20 ### New Features - Add error context when lowering hugrs to LLVM ([#2869](#2869)) </blockquote> ## `hugr-passes` <blockquote> ## [0.25.6](hugr-passes-v0.25.5...hugr-passes-v0.25.6) - 2026-02-20 ### Bug Fixes - Panic on UntuplePass when nodes had order edges ([#2883](#2883)) ### New Features - added hash.rs, updated imports ([#2840](#2840)) - Add a `NodeTemplate::call_to_function` helper ([#2878](#2878)) </blockquote> ## `hugr` <blockquote> ## [0.25.6](hugr-v0.25.5...hugr-v0.25.6) - 2026-02-20 ### Bug Fixes - Panic on UntuplePass when nodes had order edges ([#2883](#2883)) - Canonicalize more ([#2839](#2839)) - used_extensions should include transitive requirements ([#2891](#2891)) ### New Features - Add s expression format to envelope formats ([#2864](#2864)) - *(hugr-py)* Define typed Metadata protocol ([#2765](#2765)) - Add a `NodeTemplate::call_to_function` helper ([#2878](#2878)) - added hash.rs, updated imports ([#2840](#2840)) - Remove size limitation for binary envelopes ([#2880](#2880)) </blockquote> ## `hugr-cli` <blockquote> ## [0.25.6](hugr-cli-v0.25.5...hugr-cli-v0.25.6) - 2026-02-20 ### New Features - Add s expression format to envelope formats ([#2864](#2864)) </blockquote> ## `hugr-persistent` <blockquote> ## [0.4.0](hugr-persistent-v0.3.4...hugr-persistent-v0.4.0) - 2025-12-22 ### New Features - [**breaking**] Remove `RootCheckable` ([#2704](#2704)) - [**breaking**] Bump MSRV to Rust 1.89 ([#2747](#2747)) - [**breaking**] Type-safe access for node metadata ([#2755](#2755)) ### Refactor - [**breaking**] Remove multiple deprecated definitions ([#2751](#2751)) </blockquote> </p></details> --- This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).
This allows us to use the binary format instead of the deprecated JSON format in the round-trip equality testing used for HUGR validation.
I've marked one test (
simple_alias) as#[ignore]. Aliases are not properly supported by the model, are not currently used as far as I know, and will probably be removed or at least reworked soon. (See #2828 .)Some of the benchmark results are a little alarming. It looks like the regressions are mainly in a sorting method being called during JSON serialization.
Closes #2789 .