Revamp data loading and deprecate decode functions #5671
Merged
laurmaedje merged 2 commits into main on Jan 9, 2025
This PR refactors how files are loaded by the various path-taking functions. With these changes, all functions that support paths today also support bytes, for full flexibility. The existing `.decode` functions are deprecated, effective immediately.

API Changes

- `image`, `cbor`, `csv`, `json`, `toml`, `xml`, and `yaml` now support a path string or bytes, and their `.decode` variants are deprecated.
- `pdf.embed` always takes a path (since it's needed for the PDF either way) and optionally bytes (in which case the path will not be read from). Its `.decode` variant is removed without deprecation since it wasn't released yet.
- `plugin`, `bibliography`, `bibliography.style`, `cite.style`, `raw.theme`, and `raw.syntaxes` now accept bytes in addition to a path string (some also accept an array of any mix of the two). These did not have a `.decode` variant, so this adds new flexibility.

Notes

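The path-or-bytes handling described under API Changes can be pictured with a small Rust sketch. This is only an illustration under stated assumptions: the `DataSource` name is borrowed from this PR, but the variants, the `From` impls, the `load` helper, and the in-memory `files` map are hypothetical stand-ins and not the actual Typst implementation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the `DataSource` idea: either a path to read
/// from or the raw payload itself. Not the actual Typst definition.
#[derive(Debug, Clone, PartialEq)]
enum DataSource {
    Path(String),
    Bytes(Vec<u8>),
}

// A string is always treated as a path, never as the payload.
impl From<&str> for DataSource {
    fn from(path: &str) -> Self {
        DataSource::Path(path.to_string())
    }
}

// Bytes are always treated as the payload.
impl From<Vec<u8>> for DataSource {
    fn from(bytes: Vec<u8>) -> Self {
        DataSource::Bytes(bytes)
    }
}

/// Resolves a source to its bytes. `files` is an in-memory stand-in for the
/// file system (an assumption made to keep the sketch self-contained).
fn load(source: &DataSource, files: &HashMap<String, Vec<u8>>) -> Result<Vec<u8>, String> {
    match source {
        // Only the path case touches the (simulated) file system.
        DataSource::Path(path) => files
            .get(path)
            .cloned()
            .ok_or_else(|| format!("file not found: {path}")),
        DataSource::Bytes(bytes) => Ok(bytes.clone()),
    }
}

fn main() {
    let mut files = HashMap::new();
    files.insert("data.csv".to_string(), b"a,b\n1,2".to_vec());

    // Loading by path and loading from explicit bytes give the same data.
    let from_path = load(&DataSource::from("data.csv"), &files).unwrap();
    let from_bytes = load(&DataSource::from(b"a,b\n1,2".to_vec()), &files).unwrap();
    assert_eq!(from_path, from_bytes);

    // A payload passed as a plain string is looked up as a path and fails,
    // which is why string data has to be converted to bytes explicitly.
    assert!(load(&DataSource::from("a,b\n1,2"), &files).is_err());
}
```

The point of the sketch is the asymmetry: a string never doubles as data, so callers who already hold the contents must convert them to bytes to signal that they are the payload.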
- `csv.decode` accepted `str | bytes` as data, whereas the new way with just `csv` will always interpret a string as a path. If you already have a string, you need to cast it to `bytes` to make it explicit that it's the payload, not a path. This cast is very cheap thanks to "More flexible and efficient `Bytes` representation" (#5670).
- `path` is renamed to `source` or `sources` in various places to account for the bytes case. This is a slight breaking change as the field name also changes on the elements.
- A `DataSource` enum is introduced to abstract over a path or bytes.
- The `Derived<S, D>` type introduces a new way to parse data at element construction time without introducing ugly `#[internal]` fields.
- The `OneOrMultiple<T>` type is used to handle one or multiple paths/bytes (e.g. for bibliography or syntaxes) and is also used in a few other places.
- The `ManuallyHash<T>` type is useful for having non-hashable fields in structs without having to implement `Hash` manually.
- `SyntaxSet` construction changed slightly (there is now one syntax set per user syntax because they are now parsed eagerly and the syntect API kinda forces us to), but I don't expect this to have a visible effect on behaviour (still noting it here in case I'm wrong).

Issues