feat: pytket EncodedCircuit struct that keeps links back to the Hugr#1164
feat: pytket EncodedCircuit struct that keeps links back to the Hugr#1164
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1164 +/- ##
==========================================
- Coverage 79.60% 79.46% -0.14%
==========================================
Files 152 155 +3
Lines 19002 19293 +291
Branches 17912 18203 +291
==========================================
+ Hits 15126 15332 +206
- Misses 2991 3063 +72
- Partials 885 898 +13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
acl-cqc
left a comment
There was a problem hiding this comment.
Thanks Agustin - mostly good but I always find myself wanting to see how the structure is used before I can say whether it looks like the right tool for the job. So a few thoughts here but maybe need to see the creation, destruction, wiring code to say anything more.
I'm still of two minds about the subgraph-hash, but I'd support either list-index and passing around &mut, or min-node, rather than actual "Hash".
| /// Identifier for a hyper edge in the Hugr, encoded as a 64-bit hash that is | ||
| /// independent of the original Hugr representation. | ||
| /// | ||
| /// These are used to identify edges in the [`UnsupportedSubgraphPayload`] |
There was a problem hiding this comment.
Surely this is only needed for hugr fragments passed inline, rather than references to a subgraph stored outside the pytket circuit? (The latter still has all its edges in the original Hugr)
There was a problem hiding this comment.
We could technically retrieve that from the existing subgraphs when connecting two external references, but it's simpler to avoid the special case and store the extra ID.
That way we don't fail on mixed external/standalone payload types, and the structure of the Hugr is not lost if we can't resolve the external reference.
| /// See [`UnsupportedSubgraphPayload::External`]. | ||
| pub const OPGROUP_EXTERNAL_UNSUPPORTED_HUGR: &str = "EXTERNAL_UNSUPPORTED_HUGR"; | ||
|
|
||
| /// Identifier for a hyper edge in the Hugr, encoded as a 64-bit hash that is |
There was a problem hiding this comment.
hyper edge?? Isn't that an edge with >2 endpoints? (Or would that be "hyperedge"?)
I hope there is some non-hyp(er)ed term that might be used instead ;-)
There was a problem hiding this comment.
I'm borrowing the terminology from hugr-model. These are hyper edges where exactly one endpoint is an outgoing port.
It may be simpler if I go back to "Identifier for a wire in the Hugr".
| pub const OPGROUP_EXTERNAL_UNSUPPORTED_HUGR: &str = "EXTERNAL_UNSUPPORTED_HUGR"; | ||
|
|
||
| /// Identifier for a hyper edge in the Hugr, encoded as a 64-bit hash that is | ||
| /// independent of the original Hugr representation. |
There was a problem hiding this comment.
Not sure what this independence is??
| impl EncodedEdgeID { | ||
| /// Create a new subgraph hyper edge ID by hashing a Hugr wire. | ||
| pub fn new<N: HugrNode>(wire: Wire<N>) -> Self { | ||
| let hash = fxhash::hash64(&wire); |
There was a problem hiding this comment.
If you're hashing a usize node-id and a port number down to 64 bits, this is lossy / has collisions if the hugr has more than, oh, roughly 2>62 nodes. Which seems a rare case, and you might get away with the hash even then given most edges never get hashed, but I wonder about fixing #bits for node-id/portnum and panicking if the nodeid is out of range?
There was a problem hiding this comment.
The annoying part of storing the wire is the <N> parameter.
-
Nin aHugris just au32. (We can't have$2^{62}$ nodes using the default representation!) -
Nfor aPersistentHugris(CommitId, hugr::Node)~=((u32, u32), u32)
We need to encode an ID that's not dependent on N (because we always decode into a Hugr), a hash seems like the simplest thing that avoids us having to thread a shared counter thought the encoder context.
We could do that instead if you think it's needed.
| /// the connections that are not encoded in the pytket circuit. | ||
| /// | ||
| /// The types can also be inferred from the encoded hugr or linked | ||
| /// subcircuit, but we store them here to be robust. |
There was a problem hiding this comment.
or to ease human inspection and debugging
| let mut payload: UnsupportedSubgraphPayload = serde_json::from_str(&payload).map_err(|e| | ||
| PytketEncodeError::custom(format!("Barrier operation with opgroup {OPGROUP_EXTERNAL_UNSUPPORTED_HUGR} has corrupt data payload: {e}")) | ||
| )?; | ||
| let UnsupportedSubgraphPayloadType::External { id: subgraph_id } = payload.typ else { |
There was a problem hiding this comment.
Yeah, so you are encoding the external/standalone-ness of this in two places: both in whether it's OPGROUP_{EXTERNAL/STANDALONE}_UNSUPPORTED_HUGR, and in the variant of UnsupportedSubgraphPayloadType. Do you really want that redundancy? It does seem like an opportunity for things to go wrong...(otherwise you could combine the two OPGROUPs, which would be more opaque but simplify some code, or use two different structs, which....might make sense)
There was a problem hiding this comment.
Yeah, it was meant to ease validation (e.g. see if all opaque barriers are standalone without decoding the payload and allocating a String for the envelope) but that may be doable with serde_json 🤔
I'll merge the opgroups into one.
| /// # Errors | ||
| /// | ||
| /// Returns an error if a barrier operation with the [`OPGROUP_EXTERNAL_UNSUPPORTED_HUGR`] opgroup has an invalid payload. | ||
| pub(super) fn replace_external_with_standalone( |
There was a problem hiding this comment.
So fn replace_standalone_with_external returning or mutating an UnsupportedSubgraphs would also be an option, right....
There was a problem hiding this comment.
Could be, though not sure when would it be used?
| } | ||
| // TODO: Extract standalone unsupported hugr subgraphs. | ||
| // | ||
| // For now we keep the old behaviour of producing opaque TKET1.tk1op operations. |
There was a problem hiding this comment.
Well, ok. I'd be tempted to mark this unimplemented! too....
There was a problem hiding this comment.
This keeps things working as they were if we merge the PR into main.
I'd rather not panic on previously running code
| // Recursively encode the sub-graph. | ||
| let mut subencoder = PytketEncoderContext::new(circ, function, config)?; | ||
| subencoder.function_cache = self.function_cache.clone(); | ||
| subencoder.run_encoder(circ, function)?; |
There was a problem hiding this comment.
Seems plausible to pass &mut UnsupportedSubgraphs in here?
There was a problem hiding this comment.
See See #1164 (comment). This leaves the door open to using rayon in the future, at no extra cost.
| pub(super) typ: UnsupportedSubgraphPayloadType, | ||
| /// Input types of the subgraph. | ||
| /// | ||
| /// Each input is assigned a unique edge identifier, so we can reconstruct |
There was a problem hiding this comment.
Still not quite sure I get this. This structure is inside the barrier, but there's nothing else in the barrier that records which incoming edge goes where (or anything like that), right?
So for standalone circuits...the idea is that two unsupported-subgraphs will have the same edgeid (one as input, one as output) meaning you should connect the nth output of one UnsupSubg to the mth input of another?
But for external subgraphs, surely the Hugr (of which the UnsupSubg is part) has all this info too?
There was a problem hiding this comment.
So could you solve the naming problem too ;-) and go with
pub enum UnsupportedSubgraphPayload {
External { id: SubgraphId },
Inline {
inputs: Vec<(Type, EncodedEdgeId)>,
outputs: Vec<(Type, EncodedEdgeId)>,
hugr_envelope: String
}
}
I guess you're gonna raise the counter-example that you might want to mix External and Inline within the same EncodedCircuit - yes this structure allows that, but do we really want to? - in which case switching EncodedEdgeId to actual Wire would handle that without the hashing?
There was a problem hiding this comment.
See #1164 (comment), #1164 (comment).
Keeping things homogeneous makes for simpler code.
|
I think I addressed all your comments, except for
Would like your input on my comments there. |
|
Closing this, and creating a single PR instead. |
…1211) For previous reviews, see - #1164 - #1182 Comments from those have ben addressed already. This PR adds an `EncoderCircuit` in between the encoding/decoding process `Hugr <-> tket_json_rs::SerialCircuit`. This new instruction carries multiple `SerialCircuits` associated with the Hugr region they encode. This lets us replace the old regions with modified ones in the original hugr, after optimising the SerialCircuits. The opaque barriers that represent sections of the original hugr not encodable as pytket commands now have two payload variants: - `OpaqueSubgraphPayload::Inline` as before encodes the unsupported hugr envelope, plus some boundary edge ids that we can use to reconstruct links between different inline payloads. This variant does not support all circuits. Subgraphs must be flat and contain no non-local edges (just as before, but now we detect and error out on those cases). - `OpaqueSubgraphPayload::External`. This references a subgraph in the original hugr with a simple ID. This allows us to re-use the nodes from the initial hugr, keeping all the hierarchy and edges. The second case only works for `Hugr`s due to `HugrBuilder` limitations, but it's the usecase we'll use for tket1 optimisation. The first one is kept for cases where we need to extract standalone pytket circuits. For testing, this includes - Several new roundtrip tests in `serialize::pytket::tests`, including error tests - An integration test in `tket1-passes` that encodes a flat hugr, applies a pass from pytket, and reassembles it inline. BREAKING CHANGE: `OpConvertError` renamed to `PytketEncodeOpError` BREAKING CHANGE: Some minor changes to `PytketDecodeError` variant attributes. BREAKING CHANGE: `fn_name` moved from `DecodeOptions` to `DecodeInsertionTarget::Function`
Adds a
EncodedCircuitto keep track of encoded pytket circuits with external links to subgraphs in the original Hugr.Before we encoded hugr envelopes directly in pytket barrier metadata to represent unsupported subgraphs. Now we define an enum
UnsupportedSubgraphPayloadthat can be either the standalone envelope, or aSubgraphIdindex into the subgraphs carried byEncodedCircuit.The actual subgraph tracking is split into a
UnsupportedSubgraphs(inunsupported.rs) so we can pass it around in the encoder.With this definition we are now also able to encode multiple subcircuits from a single Hugr simultaneously , track them in the same structure, and optimize them in parallel. This is controlled by a new
EncodeOptions::encode_subcircuitsflag.This change effectively adds an extra step in the Hugr->pytket conversion, the previous encoding code now creates an
EncodedCircuitand immediately callsextract_standaloneto replace all subgraph pointers with envelope payloads and returns the top-level circuit.Tests coverage is partial, we're still missing the decoder implementation to glue everything back together.