@@ -67,7 +67,7 @@ TOON excels with uniform arrays of objects, but there are cases where other form
- **Deeply nested or non-uniform structures** (tabular eligibility ≈ 0%): JSON-compact often uses fewer tokens. Example: complex configuration objects with many nested levels.
- **Semi-uniform arrays** (~40–60% tabular eligibility): Token savings diminish. Prefer JSON if your pipelines already rely on it.
-- **Flat CSV use-cases**: CSV is smaller than TOON for pure tabular data. TOON adds minimal overhead (~5-10%) to provide structure (length markers, field headers, delimiter scoping) that improves LLM reliability.
+- **Flat CSV use-cases**: CSV is smaller than TOON for pure tabular data. TOON adds minimal overhead (~5-10%) to provide structure (array length declarations, field headers, delimiter scoping) that improves LLM reliability.

See [benchmarks](#benchmarks) for concrete comparisons across different data structures.

@@ -80,7 +80,7 @@ See [benchmarks](#benchmarks) for concrete comparisons across different data str
- 📐 **Indentation-based structure:** like YAML, uses whitespace instead of braces
- 🧺 **Tabular arrays:** declare keys once, stream data as rows
-- 🔗 **Optional key folding (spec v1.5):** collapses single-key wrapper chains into dotted paths (e.g., `data.metadata.items`) to reduce indentation and tokens
+- 🔗 **Optional key folding:** collapses single-key wrapper chains into dotted paths (e.g., `data.metadata.items`) to reduce indentation and tokens

[^1]: For flat tabular data, CSV is more compact. TOON adds minimal overhead to provide explicit structure and validation that improves LLM reliability.
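
The key-folding bullet above can be illustrated with a minimal sketch. It is not the library's implementation: `foldSingleKeyChain` and the `Json` type are hypothetical names, and the identifier-only rule for foldable keys is an assumption made here so that the dotted path stays unambiguous. The specification may define different conditions.

```typescript
// Hypothetical helper, not the library's API: folds a chain of single-key
// wrapper objects into one dotted path plus the value found at the end.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isPlainObject(value: Json): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function foldSingleKeyChain(value: Json): { path: string; value: Json } {
  const segments: string[] = [];
  let current: Json = value;
  // Keep descending while the current node is an object with exactly one key.
  while (isPlainObject(current)) {
    const keys = Object.keys(current);
    if (keys.length !== 1) break;
    const key = keys[0];
    // Assumption: only identifier-like keys are folded, so dots stay unambiguous.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(key)) break;
    segments.push(key);
    current = current[key];
  }
  return { path: segments.join("."), value: current };
}

const folded = foldSingleKeyChain({ data: { metadata: { items: ["a", "b"] } } });
console.log(folded.path); // data.metadata.items
```

Folding stops at the first node that is not a single-key object, so arrays and multi-key objects are left untouched.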
Converts a TOON-formatted string back to JavaScript values.
@@ -1179,7 +1146,7 @@ By default, the decoder validates input strictly:
- Format familiarity and structure matter as much as token count. TOON's tabular format requires arrays of objects with identical keys and primitive values only. When this doesn't hold (due to mixed types, non-uniform objects, or nested structures), TOON switches to list format where JSON can be more efficient at scale.
- **TOON excels at:** Uniform arrays of objects (same fields, primitive values), especially large datasets with consistent structure.
- **JSON is better for:** Non-uniform data, deeply nested structures, and objects with varying field sets.
-- **CSV is more compact for:** Flat, uniform tables without nesting. TOON adds structure (`[N]` length markers, delimiter scoping, deterministic quoting) that improves LLM reliability with minimal token overhead.
+- **CSV is more compact for:** Flat, uniform tables without nesting. TOON adds structure (`[N]` array lengths, delimiter scoping, deterministic quoting) that improves LLM reliability with minimal token overhead.
- **Token counts vary by tokenizer and model.** Benchmarks use a GPT-style tokenizer (cl100k/o200k); actual savings will differ with other models (e.g., [SentencePiece](https://github.com/google/sentencepiece)).
- **TOON is designed for LLM input** where human readability and token efficiency matter. It's **not** a drop-in replacement for JSON in APIs or storage.

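
The tabular requirement stated in the list above (objects with identical keys and primitive values only) can be checked before choosing a format. The following is a rough sketch under that reading: `tabularFields` and `isPrimitive` are hypothetical names, not part of any published API, and treating an empty array as non-tabular is an assumption.

```typescript
// Hypothetical check, not part of any published API.
type Primitive = string | number | boolean | null;

function isPrimitive(value: unknown): value is Primitive {
  return value === null || ["string", "number", "boolean"].includes(typeof value);
}

// Returns the shared field names when every row is an object with the same
// key set and only primitive values; returns null otherwise.
function tabularFields(rows: unknown[]): string[] | null {
  const first = rows[0];
  // Assumption: an empty array (first is undefined) is treated as non-tabular.
  if (typeof first !== "object" || first === null || Array.isArray(first)) return null;
  const fields = Object.keys(first);
  for (const row of rows) {
    if (typeof row !== "object" || row === null || Array.isArray(row)) return null;
    const record = row as Record<string, unknown>;
    // Same key set: equal size and every expected field present (order may differ).
    if (Object.keys(record).length !== fields.length) return null;
    for (const field of fields) {
      const hasField = Object.prototype.hasOwnProperty.call(record, field);
      if (!hasField || !isPrimitive(record[field])) return null;
    }
  }
  return fields;
}

const uniformRows = [{ id: 1, name: "Alice" }, { id: 2, name: "Bob" }];
const fields = tabularFields(uniformRows);
console.log(fields ? "tabular: " + fields.join(",") : "not tabular; compare against JSON");
```

A `null` result corresponds to the non-uniform and nested cases where the list above suggests JSON may be the better choice.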
@@ -1189,7 +1156,7 @@ TOON works best when you show the format instead of describing it. The structure

### Sending TOON to LLMs (Input)

-Wrap your encoded data in a fenced code block (label it \`\`\`toon for clarity). The indentation and headers are usually enough – models treat it like familiar YAML or CSV. The explicit length markers (`[N]`) and field headers (`{field1,field2}`) help the model track structure, especially for large tables.
+Wrap your encoded data in a fenced code block (label it \`\`\`toon for clarity). The indentation and headers are usually enough – models treat it like familiar YAML or CSV. The explicit array lengths (`[N]`) and field headers (`{field1,field2}`) help the model track structure, especially for large tables.

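
As a small illustration of the wrapping advice above, the sketch below builds a prompt from text that has already been encoded as TOON. `buildPrompt` is a hypothetical helper, and the sample data and task sentence are made up for the example. The fence string is assembled with `repeat` only so that the snippet itself contains no literal fence.

```typescript
// Hypothetical helper: wraps already-encoded TOON text in a fenced block
// labeled "toon" and appends the task description after a blank line.
function buildPrompt(toonText: string, task: string): string {
  const fence = "`".repeat(3);
  return [fence + "toon", toonText.trimEnd(), fence, "", task].join("\n");
}

// Illustrative input: a table whose header declares its length [2] and fields.
const sampleToon = [
  "users[2]{id,name,role}:",
  "  1,Alice,admin",
  "  2,Bob,user",
].join("\n");

console.log(buildPrompt(sampleToon, "List the names of all users with role \"user\"."));
```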
### Generating TOON from LLMs (Output)
@@ -1267,7 +1234,7 @@ Task: Return only users with role "user" as TOON. Use the same header. Set [N] t
## Other Implementations

> [!NOTE]
-> When implementing TOON in other languages, please follow the [specification](https://github.com/toon-format/spec/blob/main/SPEC.md) (currently v1.5) to ensure compatibility across implementations. The [conformance tests](https://github.com/toon-format/spec/tree/main/tests) provide language-agnostic test fixtures that validate your implementations.
+> When implementing TOON in other languages, please follow the [specification](https://github.com/toon-format/spec/blob/main/SPEC.md) (currently v2.0) to ensure compatibility across implementations. The [conformance tests](https://github.com/toon-format/spec/tree/main/tests) provide language-agnostic test fixtures that validate your implementations.