Skip to content

Commit a24f75e

Browse files
authored
perf(napi/parser): optimize string deserialization for non-ASCII sources (#20834)
**AI Disclosure:** Developed with Claude Code (Opus). The winning approach came out of an automated experiment loop ([auto-claude](https://github.com/joshuaisaact/auto-claude)) — I was looking for a tight feedback loop to test the tool on and the `TODO: Find best switch-over point` comment in `deserializeStr` caught my eye. 20 experiments, keep-or-revert on each (all 20 summarized in an expandable section at the bottom). All code reviewed and understood. Happy to close this if it's not useful or doesn't meet the bar. ## Why I was profiling the raw transfer deserialization path (`node --prof` on `checker.ts`) and noticed `StringAdd_CheckNone` at 13.7% of time — the single hottest function. It comes from the byte-by-byte `out += fromCodePoint(c)` loop in `deserializeStr` when `sourceIsAscii` is false. The thing is, `sourceIsAscii` is false for almost everything. All 5 NAPI bench fixtures are non-ASCII. `checker.ts` has literally one Bengali character at position 2.1M out of 2.9M. That one character disables the fast `substr` path for all ~148K strings. ## What Two changes to the generator (`tasks/ast_tools/src/generators/raw_transfer.rs`) — that's the only file with real changes. The 9 generated JS files in the diff are the mechanical output of `cargo run -p oxc_ast_tools`. **1. `firstNonAsciiPos` scan at init** — On non-ASCII sources, find the first non-ASCII byte once upfront. Strings ending before that position can still use `sourceText.substr()` since byte offsets equal char offsets in the ASCII prefix. For `checker.ts` this covers 73% of the file, for `pdf.mjs` 98%. **2. Lower TextDecoder threshold from 50 to 9** — The existing TODO asked for the right switch-over point. Experimentally, 9 is the sweet spot: `TextDecoder` beats the `fromCodePoint` concat loop for strings of 10+ bytes, and the concat loop is still faster for very short strings where the native call overhead dominates. Benchmarked on the `complicated()` test set (5 rounds of 30 iters, dropping round 1 for JIT warmup): ``` Before After checker.ts 26.7ms 13.0ms -51% cal.com.tsx 15.4ms 9.1ms -41% antd.js 53.8ms 44.7ms -17% pdf.mjs 3.8ms 4.3ms noise binder.ts 0.7ms 0.5ms noise Total 100.5ms 71.6ms -29% ``` Also verified across 15 files — the 10 non-ASCII files above plus 5 ASCII files (`react.development.js`, `binder.ts`, `moment.js`, `jquery.js`, `vue.js`). ASCII files are unchanged (our code only touches the non-ASCII path). -16% across non-ASCII files, no regressions. ## References - Addresses the `TODO: Find best switch-over point` in `STR_DESERIALIZER_BODY` - Related to perf goals in #19918 <details> <summary>I appreciate this is already a bloated PR description (sorry @overlookmotel) but given the fairly unusual approach I thought you might want to see a very short summary of all 20 experiments Claude clauded through:</summary> The loop works like this: edit code, benchmark, keep if faster, `git reset --hard` if not. Metric is total deserialization time across the benchmark corpus. **Baseline: 97.7ms** (5-file corpus, all non-ASCII) | # | Idea | Result | Verdict | |---|------|--------|---------| | 1 | Always use TextDecoder, delete the loop entirely | 99.2ms | Revert. TextDecoder's ~78ns fixed overhead kills short strings. | | 2 | Lower TextDecoder threshold from 50 to 10 (ts.js only) | 89.8ms | **Keep.** First real win — moves 50% of strings off the concat loop. | | 3 | Threshold 5 | 91.3ms | Revert. Too aggressive, too many short strings go to TextDecoder. | | 4 | Threshold 8 | 95.0ms | Revert. Worse than 10. | | 5 | Threshold 12 | 91.2ms | Revert. Worse than 10. | | 6 | Threshold 10 + unrolled `switch` on len for inline `String.fromCharCode(uint8[pos], ...)` | 90.9ms | Revert. Switch dispatch overhead eats the gain. | | 7 | Threshold 10 + special fast path for len=1 | 94.8ms | Revert. Extra branch hurts more than the 1-byte optimization helps. | | 8 | Threshold 10 + accumulate char codes in array, single `fromCharCode.apply` at end | 90.5ms | Revert. Array allocation overhead. | | 9 | Threshold 15 | 93.2ms | Revert. 10 is still the sweet spot. | | 10 | Unrolled ASCII check for bytes 1-4, TextDecoder for 5+ | 94.7ms | Revert. Branching overhead. | | 11 | **Apply threshold 10 to js.js too** (had only been changing ts.js) | 82.5ms | **Keep.** Facepalm moment — antd.js uses the JS deserializer. | | 12 | Threshold 9 in both files | 77.9ms | **Keep.** New best. | | 13 | Threshold 7 | 79.0ms | Revert. 9 wins. | | 14 | Threshold 11 | 84.8ms | Revert. 9 confirmed. | | 15 | Replace `fromCodePoint` with `String.fromCharCode` in the loop | 82.5ms | Revert. V8 optimizes the pre-extracted `fromCodePoint` better. | | 16 | Various unrolled `fromCharCode` approaches for short strings | — | Abandoned, too complex for marginal gain. | | 17 | Always TextDecoder for non-source strings (remove loop) | 96.1ms | Revert. Confirms the short-string loop IS valuable for 1-9 bytes. | | 18 | `Buffer.from().toString()` instead of TextDecoder | 80.4ms | Revert. TextDecoder is faster. | | 19 | `firstNonAsciiPos` only (use substr before it, TextDecoder after, no loop) | 84.9ms | Revert. checker.ts loved it (-34%) but antd.js hated it (+20%) because its first non-ASCII byte is at 1.3%. | | 20 | **`firstNonAsciiPos` + threshold 9 + keep the loop** | 76.4ms | **Keep.** Best of both worlds — substr where possible, TextDecoder for medium strings, loop for short. | Three things I (Claude) learned: - Experiment 11 was the biggest single win and it was just... applying the change to the other file. Embarrassing. - Every attempt to replace the `fromCodePoint` loop for short strings (1-9 bytes) made things worse. The loop is genuinely good for that range. - `firstNonAsciiPos` only works when combined with the threshold change. On its own it hurts files where non-ASCII appears early (antd.js, cal.com.tsx). </details>
1 parent 4a180d4 commit a24f75e

10 files changed

Lines changed: 151 additions & 46 deletions

File tree

apps/oxlint/src-js/generated/deserialize.js

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ let uint8,
1010
sourceText,
1111
sourceIsAscii,
1212
sourceStartPos,
13+
firstNonAsciiPos,
1314
parent = null,
1415
getLoc;
1516

@@ -42,6 +43,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
4243
float64 = buffer.float64;
4344
sourceText = sourceTextInput;
4445
sourceIsAscii = sourceText.length === sourceByteLen;
46+
if (!sourceIsAscii) {
47+
firstNonAsciiPos = sourceByteLen;
48+
for (let i = sourceStartPos, e = sourceStartPos + sourceByteLen; i < e; i++)
49+
if (uint8[i] >= 128) {
50+
firstNonAsciiPos = i - sourceStartPos;
51+
break;
52+
}
53+
}
4554
getLoc = getLocInput;
4655
return deserialize(uint32[536870900]);
4756
}
@@ -5857,11 +5866,12 @@ function deserializeStr(pos) {
58575866
len = uint32[pos32 + 2];
58585867
if (len === 0) return "";
58595868
pos = uint32[pos32];
5860-
if (sourceIsAscii && pos >= sourceStartPos) return sourceText.substr(pos - sourceStartPos, len);
5861-
// Longer strings use `TextDecoder`
5862-
// TODO: Find best switch-over point
5869+
if (pos >= sourceStartPos && (sourceIsAscii || pos - sourceStartPos + len <= firstNonAsciiPos))
5870+
return sourceText.substr(pos - sourceStartPos, len);
5871+
// Use `TextDecoder` for strings longer than 9 bytes.
5872+
// For shorter strings, the byte-by-byte loop below avoids native call overhead.
58635873
let end = pos + len;
5864-
if (len > 50) return decodeStr(uint8.subarray(pos, end));
5874+
if (len > 9) return decodeStr(uint8.subarray(pos, end));
58655875
// Shorter strings decode by hand to avoid native call
58665876
let out = "",
58675877
c;

napi/parser/src-js/generated/deserialize/js.js

Lines changed: 14 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
// Auto-generated code, DO NOT EDIT DIRECTLY!
22
// To edit this generated file you have to edit `tasks/ast_tools/src/generators/raw_transfer.rs`.
33

4-
let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos;
4+
let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos, firstNonAsciiPos;
55

66
const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
77
decodeStr = textDecoder.decode.bind(textDecoder),
@@ -20,6 +20,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
2020
float64 = buffer.float64;
2121
sourceText = sourceTextInput;
2222
sourceIsAscii = sourceText.length === sourceByteLen;
23+
if (!sourceIsAscii) {
24+
firstNonAsciiPos = sourceByteLen;
25+
for (let i = 0; i < sourceByteLen; i++)
26+
if (uint8[i] >= 128) {
27+
firstNonAsciiPos = i;
28+
break;
29+
}
30+
}
2331
return deserialize(uint32[536870900]);
2432
}
2533

@@ -4513,11 +4521,12 @@ function deserializeStr(pos) {
45134521
len = uint32[pos32 + 2];
45144522
if (len === 0) return "";
45154523
pos = uint32[pos32];
4516-
if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
4517-
// Longer strings use `TextDecoder`
4518-
// TODO: Find best switch-over point
4524+
if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
4525+
return sourceText.substr(pos, len);
4526+
// Use `TextDecoder` for strings longer than 9 bytes.
4527+
// For shorter strings, the byte-by-byte loop below avoids native call overhead.
45194528
let end = pos + len;
4520-
if (len > 50) return decodeStr(uint8.subarray(pos, end));
4529+
if (len > 9) return decodeStr(uint8.subarray(pos, end));
45214530
// Shorter strings decode by hand to avoid native call
45224531
let out = "",
45234532
c;

napi/parser/src-js/generated/deserialize/js_parent.js

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ let uint8,
77
sourceText,
88
sourceIsAscii,
99
sourceEndPos,
10+
firstNonAsciiPos,
1011
parent = null;
1112

1213
const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
@@ -26,6 +27,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
2627
float64 = buffer.float64;
2728
sourceText = sourceTextInput;
2829
sourceIsAscii = sourceText.length === sourceByteLen;
30+
if (!sourceIsAscii) {
31+
firstNonAsciiPos = sourceByteLen;
32+
for (let i = 0; i < sourceByteLen; i++)
33+
if (uint8[i] >= 128) {
34+
firstNonAsciiPos = i;
35+
break;
36+
}
37+
}
2938
return deserialize(uint32[536870900]);
3039
}
3140

@@ -5049,11 +5058,12 @@ function deserializeStr(pos) {
50495058
len = uint32[pos32 + 2];
50505059
if (len === 0) return "";
50515060
pos = uint32[pos32];
5052-
if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
5053-
// Longer strings use `TextDecoder`
5054-
// TODO: Find best switch-over point
5061+
if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
5062+
return sourceText.substr(pos, len);
5063+
// Use `TextDecoder` for strings longer than 9 bytes.
5064+
// For shorter strings, the byte-by-byte loop below avoids native call overhead.
50555065
let end = pos + len;
5056-
if (len > 50) return decodeStr(uint8.subarray(pos, end));
5066+
if (len > 9) return decodeStr(uint8.subarray(pos, end));
50575067
// Shorter strings decode by hand to avoid native call
50585068
let out = "",
50595069
c;

napi/parser/src-js/generated/deserialize/js_range.js

Lines changed: 14 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
// Auto-generated code, DO NOT EDIT DIRECTLY!
22
// To edit this generated file you have to edit `tasks/ast_tools/src/generators/raw_transfer.rs`.
33

4-
let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos;
4+
let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos, firstNonAsciiPos;
55

66
const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
77
decodeStr = textDecoder.decode.bind(textDecoder),
@@ -20,6 +20,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
2020
float64 = buffer.float64;
2121
sourceText = sourceTextInput;
2222
sourceIsAscii = sourceText.length === sourceByteLen;
23+
if (!sourceIsAscii) {
24+
firstNonAsciiPos = sourceByteLen;
25+
for (let i = 0; i < sourceByteLen; i++)
26+
if (uint8[i] >= 128) {
27+
firstNonAsciiPos = i;
28+
break;
29+
}
30+
}
2331
return deserialize(uint32[536870900]);
2432
}
2533

@@ -5063,11 +5071,12 @@ function deserializeStr(pos) {
50635071
len = uint32[pos32 + 2];
50645072
if (len === 0) return "";
50655073
pos = uint32[pos32];
5066-
if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
5067-
// Longer strings use `TextDecoder`
5068-
// TODO: Find best switch-over point
5074+
if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
5075+
return sourceText.substr(pos, len);
5076+
// Use `TextDecoder` for strings longer than 9 bytes.
5077+
// For shorter strings, the byte-by-byte loop below avoids native call overhead.
50695078
let end = pos + len;
5070-
if (len > 50) return decodeStr(uint8.subarray(pos, end));
5079+
if (len > 9) return decodeStr(uint8.subarray(pos, end));
50715080
// Shorter strings decode by hand to avoid native call
50725081
let out = "",
50735082
c;

napi/parser/src-js/generated/deserialize/js_range_parent.js

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ let uint8,
77
sourceText,
88
sourceIsAscii,
99
sourceEndPos,
10+
firstNonAsciiPos,
1011
parent = null;
1112

1213
const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
@@ -26,6 +27,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
2627
float64 = buffer.float64;
2728
sourceText = sourceTextInput;
2829
sourceIsAscii = sourceText.length === sourceByteLen;
30+
if (!sourceIsAscii) {
31+
firstNonAsciiPos = sourceByteLen;
32+
for (let i = 0; i < sourceByteLen; i++)
33+
if (uint8[i] >= 128) {
34+
firstNonAsciiPos = i;
35+
break;
36+
}
37+
}
2938
return deserialize(uint32[536870900]);
3039
}
3140

@@ -5602,11 +5611,12 @@ function deserializeStr(pos) {
56025611
len = uint32[pos32 + 2];
56035612
if (len === 0) return "";
56045613
pos = uint32[pos32];
5605-
if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
5606-
// Longer strings use `TextDecoder`
5607-
// TODO: Find best switch-over point
5614+
if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
5615+
return sourceText.substr(pos, len);
5616+
// Use `TextDecoder` for strings longer than 9 bytes.
5617+
// For shorter strings, the byte-by-byte loop below avoids native call overhead.
56085618
let end = pos + len;
5609-
if (len > 50) return decodeStr(uint8.subarray(pos, end));
5619+
if (len > 9) return decodeStr(uint8.subarray(pos, end));
56105620
// Shorter strings decode by hand to avoid native call
56115621
let out = "",
56125622
c;

napi/parser/src-js/generated/deserialize/ts.js

Lines changed: 14 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
// Auto-generated code, DO NOT EDIT DIRECTLY!
22
// To edit this generated file you have to edit `tasks/ast_tools/src/generators/raw_transfer.rs`.
33

4-
let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos;
4+
let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos, firstNonAsciiPos;
55

66
const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
77
decodeStr = textDecoder.decode.bind(textDecoder),
@@ -20,6 +20,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
2020
float64 = buffer.float64;
2121
sourceText = sourceTextInput;
2222
sourceIsAscii = sourceText.length === sourceByteLen;
23+
if (!sourceIsAscii) {
24+
firstNonAsciiPos = sourceByteLen;
25+
for (let i = 0; i < sourceByteLen; i++)
26+
if (uint8[i] >= 128) {
27+
firstNonAsciiPos = i;
28+
break;
29+
}
30+
}
2331
return deserialize(uint32[536870900]);
2432
}
2533

@@ -4822,11 +4830,12 @@ function deserializeStr(pos) {
48224830
len = uint32[pos32 + 2];
48234831
if (len === 0) return "";
48244832
pos = uint32[pos32];
4825-
if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
4826-
// Longer strings use `TextDecoder`
4827-
// TODO: Find best switch-over point
4833+
if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
4834+
return sourceText.substr(pos, len);
4835+
// Use `TextDecoder` for strings longer than 9 bytes.
4836+
// For shorter strings, the byte-by-byte loop below avoids native call overhead.
48284837
let end = pos + len;
4829-
if (len > 50) return decodeStr(uint8.subarray(pos, end));
4838+
if (len > 9) return decodeStr(uint8.subarray(pos, end));
48304839
// Shorter strings decode by hand to avoid native call
48314840
let out = "",
48324841
c;

napi/parser/src-js/generated/deserialize/ts_parent.js

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ let uint8,
77
sourceText,
88
sourceIsAscii,
99
sourceEndPos,
10+
firstNonAsciiPos,
1011
parent = null;
1112

1213
const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
@@ -26,6 +27,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
2627
float64 = buffer.float64;
2728
sourceText = sourceTextInput;
2829
sourceIsAscii = sourceText.length === sourceByteLen;
30+
if (!sourceIsAscii) {
31+
firstNonAsciiPos = sourceByteLen;
32+
for (let i = 0; i < sourceByteLen; i++)
33+
if (uint8[i] >= 128) {
34+
firstNonAsciiPos = i;
35+
break;
36+
}
37+
}
2938
return deserialize(uint32[536870900]);
3039
}
3140

@@ -5385,11 +5394,12 @@ function deserializeStr(pos) {
53855394
len = uint32[pos32 + 2];
53865395
if (len === 0) return "";
53875396
pos = uint32[pos32];
5388-
if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
5389-
// Longer strings use `TextDecoder`
5390-
// TODO: Find best switch-over point
5397+
if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
5398+
return sourceText.substr(pos, len);
5399+
// Use `TextDecoder` for strings longer than 9 bytes.
5400+
// For shorter strings, the byte-by-byte loop below avoids native call overhead.
53915401
let end = pos + len;
5392-
if (len > 50) return decodeStr(uint8.subarray(pos, end));
5402+
if (len > 9) return decodeStr(uint8.subarray(pos, end));
53935403
// Shorter strings decode by hand to avoid native call
53945404
let out = "",
53955405
c;

napi/parser/src-js/generated/deserialize/ts_range.js

Lines changed: 14 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
// Auto-generated code, DO NOT EDIT DIRECTLY!
22
// To edit this generated file you have to edit `tasks/ast_tools/src/generators/raw_transfer.rs`.
33

4-
let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos;
4+
let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos, firstNonAsciiPos;
55

66
const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
77
decodeStr = textDecoder.decode.bind(textDecoder),
@@ -20,6 +20,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
2020
float64 = buffer.float64;
2121
sourceText = sourceTextInput;
2222
sourceIsAscii = sourceText.length === sourceByteLen;
23+
if (!sourceIsAscii) {
24+
firstNonAsciiPos = sourceByteLen;
25+
for (let i = 0; i < sourceByteLen; i++)
26+
if (uint8[i] >= 128) {
27+
firstNonAsciiPos = i;
28+
break;
29+
}
30+
}
2331
return deserialize(uint32[536870900]);
2432
}
2533

@@ -5403,11 +5411,12 @@ function deserializeStr(pos) {
54035411
len = uint32[pos32 + 2];
54045412
if (len === 0) return "";
54055413
pos = uint32[pos32];
5406-
if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
5407-
// Longer strings use `TextDecoder`
5408-
// TODO: Find best switch-over point
5414+
if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
5415+
return sourceText.substr(pos, len);
5416+
// Use `TextDecoder` for strings longer than 9 bytes.
5417+
// For shorter strings, the byte-by-byte loop below avoids native call overhead.
54095418
let end = pos + len;
5410-
if (len > 50) return decodeStr(uint8.subarray(pos, end));
5419+
if (len > 9) return decodeStr(uint8.subarray(pos, end));
54115420
// Shorter strings decode by hand to avoid native call
54125421
let out = "",
54135422
c;

napi/parser/src-js/generated/deserialize/ts_range_parent.js

Lines changed: 14 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ let uint8,
77
sourceText,
88
sourceIsAscii,
99
sourceEndPos,
10+
firstNonAsciiPos,
1011
parent = null;
1112

1213
const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
@@ -26,6 +27,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
2627
float64 = buffer.float64;
2728
sourceText = sourceTextInput;
2829
sourceIsAscii = sourceText.length === sourceByteLen;
30+
if (!sourceIsAscii) {
31+
firstNonAsciiPos = sourceByteLen;
32+
for (let i = 0; i < sourceByteLen; i++)
33+
if (uint8[i] >= 128) {
34+
firstNonAsciiPos = i;
35+
break;
36+
}
37+
}
2938
return deserialize(uint32[536870900]);
3039
}
3140

@@ -5966,11 +5975,12 @@ function deserializeStr(pos) {
59665975
len = uint32[pos32 + 2];
59675976
if (len === 0) return "";
59685977
pos = uint32[pos32];
5969-
if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
5970-
// Longer strings use `TextDecoder`
5971-
// TODO: Find best switch-over point
5978+
if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
5979+
return sourceText.substr(pos, len);
5980+
// Use `TextDecoder` for strings longer than 9 bytes.
5981+
// For shorter strings, the byte-by-byte loop below avoids native call overhead.
59725982
let end = pos + len;
5973-
if (len > 50) return decodeStr(uint8.subarray(pos, end));
5983+
if (len > 9) return decodeStr(uint8.subarray(pos, end));
59745984
// Shorter strings decode by hand to avoid native call
59755985
let out = "",
59765986
c;

0 commit comments

Comments
 (0)