Separate decode from string/bytes for all data functions; and encode for json, toml, yaml via serde #1935
I thought about moving the `convert_*` logic out of data.rs into a real `Deserialize` implementation. |
If that works, that'd be great, especially considering the existing Serialize implementation. |
|
There are a few problems:
Maybe we should use some kind of enum tagging for all non-native types (numbers, strings, bool, none)? |
I think that's fine to be honest. You don't really want to serialize a large byte buffer to a JSON array, it will be horribly slow (which is precisely why it's serialized that way, otherwise
We could but I'm not sure it would be good. It is not to be expected that serialize -> deserialize roundtrip of arbitrary Typst values is lossless. I think adding tags would be quite heavy. For plugin usage, you generally want to have your input data expressed mostly in terms of primitives that just work with JSON.
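To make the roundtrip point concrete, here is a minimal sketch using only the Rust standard library. The `Value` enum, `encode`, and `decode` are hypothetical toy stand-ins, not Typst's real types or its serde implementation, and the encoder skips string escaping entirely. It only illustrates why an untagged encoding cannot tell a length apart from a plain string once it has been written out:

```rust
/// Toy stand-in for a Typst value (hypothetical, not Typst's real `Value`).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
    /// A length in points, e.g. 28.35pt.
    Length(f64),
}

/// Untagged encoding: primitives map directly to JSON-like text,
/// anything else falls back to a quoted textual form.
/// Assumption: strings contain nothing that needs escaping.
fn encode(value: &Value) -> String {
    match value {
        Value::Int(n) => n.to_string(),
        Value::Str(s) => format!("\"{}\"", s),
        Value::Length(pt) => format!("\"{}pt\"", pt),
    }
}

/// Decoding only ever sees primitives, so every quoted token becomes a `Str`.
fn decode(text: &str) -> Option<Value> {
    if let Some(inner) = text.strip_prefix('"').and_then(|t| t.strip_suffix('"')) {
        return Some(Value::Str(inner.to_string()));
    }
    text.parse::<i64>().ok().map(Value::Int)
}
```

In this toy model an integer survives `decode(&encode(..))` unchanged, while `Value::Length(28.35)` comes back as `Value::Str("28.35pt")`. Adding a tag to every non-primitive value would fix that, at the cost of the heavier output described above.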
I am not familiar with the |
With that said (and going on a bit of a tangent), would it make sense to add native BSON conversion to Typst (with the same API as json(), including the proposed functions from this PR)? Perhaps that could be yet another handy tool for plugin creators. |
I would be ok with that. Although I wonder if bson or bincode is the more "blessed" binary encoding. |
|
I'm just experimenting with the following:

JSON

    {
      "a": 1,
      "b": [
        1,
        2,
        3.1415,
        "Test"
      ],
      "c": {
        "func": "text",
        "text": "Aha"
      },
      "d": [
        true,
        false
      ],
      "e": null,
      "f": [
        "28.35pt",
        "90deg"
      ]
    }

TOML

    a = 1
    b = [
      1,
      2,
      3.1415,
      "Test",
    ]
    d = [
      true,
      false,
    ]
    f = ["28.35pt", "90deg"]

    [c]
    func = "text"
    text = "Aha"

YAML

    a: 1
    b:
      - 1
      - 2
      - 3.1415
      - Test
    c: !Content
      func: text
      text: Aha
    d:
      - true
      - false
    e: null
    f:
      - !length 28.35pt
      - !angle 90deg
|
Interesting, I didn't know of that YAML feature. |
|
Is there a convenient way to get back from a repr()-string to a Value? I would use eval(), but don't have a World. |
@PgBiel do you still think that allowing bytes as input to json.decode and friends is problematic? I ask because it's currently implemented that way. One upside is that deserializing plugin output becomes more efficient: converting to a string involves not only the UTF-8 check but also a full copy from a wonderful prehashed byte buffer to a non-prehashed EcoString.
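The cost difference described here can be sketched with the standard library alone. In the snippet below, `String` is only a stand-in for `EcoString`, the function names are hypothetical, and prehashing is not modelled at all. It simply contrasts validating a byte buffer in place with validating it and then copying it into an owned string:

```rust
/// Borrowing path: one UTF-8 validation pass over the buffer, no copy.
/// The returned `&str` points into the original bytes.
fn view_as_str(bytes: &[u8]) -> Result<&str, std::str::Utf8Error> {
    std::str::from_utf8(bytes)
}

/// Owning path: the same validation pass, followed by a full copy
/// into a freshly allocated `String` (stand-in for `EcoString`).
fn copy_to_string(bytes: &[u8]) -> Result<String, std::str::Utf8Error> {
    std::str::from_utf8(bytes).map(|s| s.to_owned())
}
```

Both paths reject invalid UTF-8, so accepting bytes does not skip validation. What it avoids is the extra allocation and copy of the second path.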
No. But this would only be useful for the tagged YAML thing anyway right? Because otherwise you can't know whether it was just a normal string. To be honest, I think nobody will use the tagged YAML thing, I'd just skip it and not try to do anything smart with unrepr-ing a string. repr also doesn't always produce evaluatable output. |
I think it's fine (I'm guessing it will check for UTF-8 validity either way). Ship it 🚀 (Although string should still be an option of course) |
I think we can have support for both in the future if there is demand for that. I just thought BSON (for now) would be more generic and probably easier to use across different languages. |
|
The discussion might be a different one, but the implementation is related. |
Of course, but the PR becomes easier to approve and reason about when it's more self-contained. |
|
I tend to agree that a new format should be a separate PR. |
|
2 : 1 ok. ;-) |
|
Since the separated PR with new format(s) will depend on this, I will place it when this one is done. |
|
Please consider #1997 to calm clippy down even more and get a test pass... |
|
Thank you! |
Resolves typst#6738

- Change `serialize_str(bytes)` from `Debug::fmt` (_Bytes(n)_) to `repr` (_bytes(n)_) and add tests.
- Update docs related to data loading and `repr`.

References:

- Initial discussions in typst#1935
- [serde_json 1.0.138](https://docs.rs/serde_json/1.0.138/serde_json/value/enum.Value.html)
- [serde_yaml 0.8.26](https://docs.rs/serde_yaml/0.8.26/serde_yaml/enum.Value.html)
Resolves typst#6738

- Update docs related to data loading and `repr`.
- Change `serialize_str(bytes)` from `Debug::fmt` (_Bytes(n)_) to `repr` (_bytes(n)_).
- Narrow the input of `toml.decode` and the output of `toml` from `Value` to `Dict`.

References:

- Initial discussions in typst#1935
- [serde_json 1.0.138](https://docs.rs/serde_json/1.0.138/serde_json/value/enum.Value.html)
- [serde_yaml 0.8.26](https://docs.rs/serde_yaml/0.8.26/serde_yaml/enum.Value.html)
- [toml 0.8.19](https://docs.rs/toml/0.8.19/toml/enum.Value.html)
Resolves typst#6738

- Update docs related to data loading and `repr`.
  - Add a _Conversion details_ section to each format.
  - Mention that `cbor.encode` uses ciborium, and other implementations may not be able to parse its result.
  - Mention that `*.encode` may fall back to `repr`, and explain common confusions.
  - Fix a few copy-and-paste errors.
  - Use the terms of each format. For instance, JSON object, YAML mapping, TOML table, CBOR map. (People are really good at coining names.)
- Change `serialize_str(bytes)` from `Debug::fmt` to `repr`. That is, _Bytes(n)_ → _bytes(n)_ for human-readable formats (JSON, YAML, TOML).
- Narrow the input of `toml.decode` and the output of `toml` from `Value` to `Dict`, because TOML documents can only be tables.

References:

- Initial discussions in typst#1935
- [serde_json 1.0.138](https://docs.rs/serde_json/1.0.138/serde_json/value/enum.Value.html)
- [serde_yaml 0.8.26](https://docs.rs/serde_yaml/0.8.26/serde_yaml/enum.Value.html)
- [toml 0.8.19](https://docs.rs/toml/0.8.19/toml/enum.Value.html)
- [ciborium 0.2.2](https://docs.rs/ciborium/0.2.2/ciborium/enum.Value.html)
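The `Debug::fmt` versus `repr` change for bytes can be pictured with a small, hypothetical model. The `Bytes` struct below is a toy stand-in, not Typst's actual type, and only reproduces the two spellings quoted in the commit message (_Bytes(n)_ and _bytes(n)_):

```rust
use std::fmt;

/// Hypothetical stand-in for a byte buffer value.
struct Bytes(Vec<u8>);

impl fmt::Debug for Bytes {
    /// Rust-flavoured debug output: capitalised type name plus the length.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Bytes({})", self.0.len())
    }
}

impl Bytes {
    /// User-facing representation: lowercase name plus the length,
    /// matching the `bytes(n)` spelling from the commit message.
    fn repr(&self) -> String {
        format!("bytes({})", self.0.len())
    }
}
```

Under these assumptions, a serializer that writes the `Debug` form emits the string `Bytes(3)` for a three-byte buffer, while one that writes `repr` emits `bytes(3)`, which lines up with what a user would see from `repr` inside a document.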
Resolves #1647. This allows data to be decoded from strings and bytes as well, and adds encoding for JSON, TOML and YAML. The new functions live within the scope of the original functions, e.g.