feat: remove block decoding global registry#80
Conversation
| type BlockDecoder interface { | ||
| Register(codec uint64, decoder DecodeBlockFunc) | ||
| Decode(blocks.Block) (Node, error) | ||
| } | ||
| type safeBlockDecoder struct { | ||
| // Can be replaced with an RCU if necessary. | ||
| lock sync.RWMutex | ||
| decoders map[uint64]DecodeBlockFunc | ||
| } | ||
|
|
||
| // Register registers decoder for all blocks with the passed codec. | ||
| // | ||
| // This will silently replace any existing registered block decoders. | ||
| func (d *safeBlockDecoder) Register(codec uint64, decoder DecodeBlockFunc) { | ||
| d.lock.Lock() | ||
| defer d.lock.Unlock() | ||
| d.decoders[codec] = decoder | ||
| } | ||
|
|
||
| func (d *safeBlockDecoder) Decode(block blocks.Block) (Node, error) { |
There was a problem hiding this comment.
Any preferences on whether removing this is a good/bad idea? We could just expose this as a registry any application can implement (as in the go-ipld-legacy PR).
There was a problem hiding this comment.
@Jorropo @willscott @rvagg I'm thinking of not keeping the interface, but exposing a non-threadsafe registry here like exists in go-ipld-prime. This makes upgrading easier for people without having to consider implications like "should I use the go-ipld-format codec or the go-ipld-prime version of the same one". People can then opt into the go-ipld-prime codecs optionally/as needed (e.g. since there are more of them around these days).
|
|
||
| func (d *safeBlockDecoder) Decode(block blocks.Block) (Node, error) { | ||
| // Decode decodes the given block using the default BlockDecoder. | ||
| func Decode(block blocks.Block, decoder DecodeBlockFunc) (Node, error) { |
There was a problem hiding this comment.
I'm questioning the value of this function. If the caller has the decoder already why would it call it in here ? I would call the decoder directly.
There was a problem hiding this comment.
Yeah, I'm cool just deleting this given we're breaking (a very small number of) users here anyway.
There was a problem hiding this comment.
The only thing this does is shortcut if your block is already a node which is not necessarily an obvious optimization. This makes the modifications upstream a no-brainer one line thing (import the codec + pass it) rather than a few lines of copy-paste.
There was a problem hiding this comment.
If a consumer randomly handle mixed basic blocks and deserialized ones it sound like a them issue.
What about using strong types ? 🙃
I'm fine to keep this as-is if anyone incists.
There was a problem hiding this comment.
Being the person doing the bubbling (and having now gone and located a bunch of codebases to do this) I'm thinking let's just keep it and add a comment that it's a helper function. I'd like propagating updates here to be as mindless as possible, changing behavior doesn't make it mindless.
coding.go
Outdated
| } | ||
|
|
||
| func (d *safeBlockDecoder) Decode(block blocks.Block) (Node, error) { | ||
| // Decode decodes the given block using the default BlockDecoder. |
There was a problem hiding this comment.
Docs seems wrong, AFAIT the decoder is an argument passed in.
There was a problem hiding this comment.
yeah thanks that's the old docs, will update
4b5b349 to
c3da01c
Compare
|
sgtm, including removal of |
c3da01c to
b3d3a1e
Compare
|
2023-06-06 Kubo/Boxo maintainer conversation:
|
8766125 to
7719ef4
Compare
willscott
left a comment
There was a problem hiding this comment.
I'm fine with this as is.
- I'd vaguely prefer using
sync.Oncearound initialization, and a lock protecting concurrent map accesses
| d.lock.RLock() | ||
| decoder, ok := d.decoders[ty] | ||
| d.lock.RUnlock() | ||
| r.ensureInit() |
There was a problem hiding this comment.
is there a way to avoid doing this work on every usage?
There was a problem hiding this comment.
I just copied this from the go-ipld-prime one https://github.com/ipld/go-ipld-prime/blob/master/multicodec/registry.go. But yeah happy to change it either here, or in a patch release if this ends up being annoying for people.
There was a problem hiding this comment.
and a lock protecting concurrent map accesses
Same in that I copied from go-ipld-prime. The struct here used to be threadsafe and I'm fine with going back to that version too if preferred.
The global registry serves as a footgun for application developers given the subtle way (module initialization ordering) that the registry tends to be used. Historically this hasn't been so much of a problem because there's never been more than one implementation of a given codec that conforms to the go-ipld-format interfaces.
However, that's no longer the case due to their being two forks of the go-ipld-format version of the dag-pb codec (go-merkledag maintained by @rvagg and boxo's fork maintained by @ipfs/kubo-maintainers.
Related to ipfs/boxo#291 and ipfs/go-ipld-legacy#12
From what I can tell via some scanning on grep.app for ipld.Decode, and format.Decode as well as GitHub search the blast radius of this change appears fairly minimal and will help as either more codec implementations emerge or get consolidated.
Side note: this would've been a problem with https://github.com/ipld/go-codec-dagpb as well if someone had written a wrapper to make it compatible with go-ipld-format. However, instead it lives in a different global registry. The go-ipld-prime default registry is similarly at risk of being a problem, although at least users of the global registry are warned about its risks in the comments.
cc @willscott @rvagg @Jorropo