Implement options to handle IDENTITY CIDs gracefully#239
Conversation
f1622e4 to
9f9f687
Compare
| // Assert the id is indexed. | ||
| r, err := carv2.OpenReader(path) | ||
| require.NoError(t, err) | ||
| t.Cleanup(func() { require.NoError(t, r.Close()) }) |
There was a problem hiding this comment.
no need to check read-only close errors 😉
| records = append(records, index.Record{Cid: c, Offset: uint64(sectionOffset)}) | ||
|
|
||
| if c.Prefix().MhType != multihash.IDENTITY || o.IncludeIdentityCIDs { | ||
| if uint64(cidLen) > o.MaxIndexCidSize { |
There was a problem hiding this comment.
coverage of this case might be nice - it could even fit in one of the existing tests here without needing a special test just for it
There was a problem hiding this comment.
There is a test for large CIDs already in this PR here. Do you think it's missing something?
There was a problem hiding this comment.
yeah, it's not using the default (common) path but setting an unrealistic max to break, that was my thought
but this is a nice to have, not a demand for busy-work if you're comfortable with this as is
|
The maximum of 2K is somewhat arbitrary isn't it? Not disagreeing with it, it feels arbitrary no matter how you slice it, but if we have a rationale then that might be good to include in a note. By locking it in a default here as 2K we're probably going to be pushing that fairly broadly through the stack if the CARv2 writer interface becomes the default for creating CARs. Which is probably good we have a proper limiting condition, just with possible ramifications. I might have to consider getting something similar in to the JavaScript CAR because we now have a lot of user data being generated that way, it's possible we get large identity CIDs slipping in there and only being caught if/when they hit this code in Filecoin. |
It is mostly arbitrary indeed; the rationale was to pick something that is somewhat future proof and unlikely to be hit in reasonable use cases today. Any suggestions for a better value is more than welcomed. I know @aschmahmann pointed out a limit of Thoughts? |
bdfd5f3 to
92e3c89
Compare
92e3c89 to
023c842
Compare
ribasushi
left a comment
There was a problem hiding this comment.
Solid wok overall, left a number of comments regarding naming/clarifications
None of the remarks are showstoppers, take them or leave them: it's good either way.
Implement two additional options that allow a CARv2 file to 1) include IDENTNTIY CIDs, and 2) specify a maximum allowed CID length with default of 2KiB as a sufficiently large default. Configure ReadWrite blockstore to persist given blocks with IDENTITY CIDs. Introduce a new Characteristics filed that signalls whether an index in a CAR file contains a full catalog of CIDs for backward compatibility purposes. Note, this is a new addition and will need to be added to the spec in a separate PR. Relates to #215
023c842 to
29405e4
Compare
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Update CARv2 specification to include a format for MultihashIndexSorted and fully-indexed characteristic bitfield. Relates to: - ipld/go-car#239 - ipld/go-car#217
Implement two additional options that allow a CARv2 file to 1) include
IDENTNTIY CIDs, and 2) specify a maximum allowed CID length with default
of 2KiB as a sufficiently large default.
Configure ReadWrite blockstore to persist given blocks with IDENTITY
CIDs.
Introduce a new Characteristics filed that signalls whether an index in
a CAR file contains a full catalog of CIDs for backward compatibility
purposes. Note, this is a new addition and will need to be added to the
spec in a separate PR.
Relates to #215