Revamp data loading and deprecate `decode` functions by laurmaedje · Pull Request #5671 · typst/typst

laurmaedje · 2025-01-08T22:10:53Z

This PR refactors how files are loaded by the various path-taking functions. With these changes, every function that accepts a path today also accepts bytes, for full flexibility. The existing .decode functions are deprecated, effective immediately.

API Changes

image, cbor, csv, json, toml, xml, and yaml now support a path string or bytes and their .decode variants are deprecated.
pdf.embed always takes a path (since the PDF needs it either way) and optionally bytes (in which case the path is not read from). Its .decode variant is removed without deprecation since it had not been released yet.
plugin, bibliography, bibliography.style, cite.style, raw.theme, and raw.syntaxes now accept bytes in addition to a path string (some also accept an array of any mix of the two). These did not have a .decode variant, so this adds new flexibility.
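To make the "path string or bytes" rule concrete, here is a minimal sketch in Rust. It is not the code from this PR: the `Source` enum, the `load` function, and the in-memory `files` map are hypothetical stand-ins chosen for illustration. The only behaviour taken from the description is that a string is always treated as a path and bytes are always treated as the payload.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a "path or bytes" argument.
/// This is an illustration only, not typst's actual type.
#[derive(Debug, Clone, PartialEq)]
enum Source {
    /// A string argument is always interpreted as a path.
    Path(String),
    /// Bytes are always interpreted as the payload itself.
    Bytes(Vec<u8>),
}

/// Resolves a source to raw bytes.
///
/// `files` is a hypothetical in-memory file system (path -> contents),
/// used here so the sketch does not touch the real disk.
fn load(source: &Source, files: &HashMap<String, Vec<u8>>) -> Result<Vec<u8>, String> {
    match source {
        // Paths are looked up; a missing file is an error.
        Source::Path(path) => files
            .get(path)
            .cloned()
            .ok_or_else(|| format!("file not found: {path}")),
        // Bytes are returned as-is, no file access happens.
        Source::Bytes(bytes) => Ok(bytes.clone()),
    }
}
```

In this sketch, passing CSV text as `Source::Path` fails with a "file not found" error, which mirrors why a string payload has to be cast to bytes first.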

Notes

csv.decode accepted str | bytes as data. The new way, with just csv, always interprets a string as a path. If you already have a string, you need to cast it to bytes to make it explicit that it's the payload, not a path. This cast is very cheap thanks to "More flexible and efficient Bytes representation" (#5670).
The deprecations are only in the docs so far; there are no warnings yet. I plan to deal with that separately, alongside "Make it possible to deprecate constants" (#5582).
The argument names were changed from path to source or sources in various places to account for the bytes case. This is a slight breaking change as the field name also changes on the elements.
A new DataSource enum is introduced to abstract over a path or bytes.
The new Derived<S, D> type introduces a new way to parse data at element construction time without introducing ugly #[internal] fields.
The new OneOrMultiple<T> type is used to handle one or multiple paths/bytes (e.g. for bibliography or syntaxes) and is also used in a few other places.
The new ManuallyHash<T> type is useful to have non-hashable fields in structs without having to implement Hash manually.
Path autocompletions were extended to handle a few cases that were missing so far.
Image format autodetection now has basic support for SVG.
The details around syntect SyntaxSet construction changed slightly (there is now one syntax set per user syntax because they are now parsed eagerly and the syntect API kinda forces us to), but I don't expect this to visibly change behaviour (still noting it here in case I'm wrong).
Yes, you can now generate WASM bytes at runtime. Have fun writing a JIT in Typst.
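For readers unfamiliar with the helper types named in the notes, the following Rust sketch shows the general idea behind a "one or multiple" container and a wrapper that hashes by a manually supplied value. The type names are borrowed from the notes, but every definition, method (`one`, `many`, `new`, `get`), the `hash_of` helper, and the `Theme` struct are assumptions made for illustration. The actual types in this PR may look quite different.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical sketch: one value or several, stored uniformly as a list.
#[derive(Debug, Clone, PartialEq, Hash)]
struct OneOrMultiple<T>(Vec<T>);

impl<T> OneOrMultiple<T> {
    /// Wraps a single item (e.g. one path).
    fn one(item: T) -> Self {
        Self(vec![item])
    }

    /// Wraps several items (e.g. an array of paths and bytes).
    fn many(items: Vec<T>) -> Self {
        Self(items)
    }
}

/// Hypothetical sketch: a value paired with a caller-supplied hash, so that
/// a struct containing a non-hashable field can still derive `Hash`.
struct ManuallyHash<T> {
    inner: T,
    hash: u64,
}

impl<T> ManuallyHash<T> {
    fn new(inner: T, hash: u64) -> Self {
        Self { inner, hash }
    }

    /// Gives access to the wrapped value.
    fn get(&self) -> &T {
        &self.inner
    }
}

impl<T> Hash for ManuallyHash<T> {
    // Only the supplied hash is fed to the hasher; `inner` is ignored.
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.hash.hash(state);
    }
}

/// Helper for the sketch: hashes any `Hash` value with the std default hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// `f64` does not implement `Hash`, yet this struct can derive it
/// because the field is wrapped.
#[derive(Hash)]
struct Theme {
    name: String,
    scale: ManuallyHash<f64>,
}
```

A `Theme` could then be built with `ManuallyHash::new(1.5, hash_of(&1.5f64.to_bits()))`, where the caller decides what the hash of the wrapped value should be.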

Issues

Revamp data loading and deprecate decode functions

3b74d4c

laurmaedje force-pushed the revamp-data-loading branch from ab936bb to 3b74d4c on January 8, 2025 22:22

Skip UTF-8 validation when bytes were created from string

fd3dcb8

laurmaedje added this pull request to the merge queue Jan 9, 2025

Merged via the queue into main with commit e2b37fe Jan 9, 2025
12 checks passed

laurmaedje deleted the revamp-data-loading branch January 9, 2025 09:39

MDLC01 mentioned this pull request Jan 23, 2025

Make it possible to deprecate constants #5582

Closed

laurmaedje merged 2 commits into main from revamp-data-loading
