perf(napi/parser): optimize string deserialization for non-ASCII sources (#20834)

joshuaisaact · web-flow · commit a24f75e9e8c2 · 2026-03-30T16:19:17.000+01:00
**AI Disclosure:** Developed with Claude Code (Opus). The winning approach came out of an automated experiment loop ([auto-claude](https://github.com/joshuaisaact/auto-claude)): I was looking for a tight feedback loop to test the tool on, and the `TODO: Find best switch-over point` comment in `deserializeStr` caught my eye. 20 experiments, keep-or-revert on each (all 20 summarized in an expandable section at the bottom). All code reviewed and understood. Happy to close this if it's not useful or doesn't meet the bar.

## Why

I was profiling the raw transfer deserialization path (`node --prof` on `checker.ts`) and noticed `StringAdd_CheckNone` at 13.7% of time, the single hottest function. It comes from the byte-by-byte `out += fromCodePoint(c)` loop in `deserializeStr` when `sourceIsAscii` is false.

The thing is, `sourceIsAscii` is false for almost everything. All 5 NAPI bench fixtures are non-ASCII. `checker.ts` has literally one Bengali character at position 2.1M out of 2.9M. That one character disables the fast `substr` path for all ~148K strings.

## What

Two changes to the generator (`tasks/ast_tools/src/generators/raw_transfer.rs`), which is the only file with real changes. The 9 generated JS files in the diff are the mechanical output of `cargo run -p oxc_ast_tools`.

**1. `firstNonAsciiPos` scan at init.** On non-ASCII sources, find the first non-ASCII byte once upfront. Strings ending before that position can still use `sourceText.substr()` since byte offsets equal char offsets in the ASCII prefix. For `checker.ts` this covers 73% of the file, for `pdf.mjs` 98%.

**2. Lower TextDecoder threshold from 50 to 9.** The existing TODO asked for the right switch-over point. Experimentally, 9 is the sweet spot: `TextDecoder` beats the `fromCodePoint` concat loop for strings of 10+ bytes, and the concat loop is still faster for very short strings where the native call overhead dominates.

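For readers who want the two changes in one place, here is a minimal standalone sketch of the three-tier decode described above. It is not the generated code: the function names `findFirstNonAsciiPos` and `decodeSlice` are hypothetical, the source is assumed to start at byte offset 0 of the buffer, and the fallback taken when the short-string loop meets a non-ASCII byte is an assumption for illustration.

```javascript
// Hypothetical standalone sketch; only `firstNonAsciiPos` is a name from the PR.
const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true });
const { fromCodePoint } = String;

// Byte offset of the first byte >= 128, or `bytes.length` if the input is pure ASCII.
function findFirstNonAsciiPos(bytes) {
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] >= 128) return i;
  }
  return bytes.length;
}

// Decode `len` UTF-8 bytes starting at byte offset `pos`.
// `sourceText` is the JS string whose UTF-8 encoding is `bytes`.
function decodeSlice(bytes, sourceText, firstNonAsciiPos, pos, len) {
  if (len === 0) return "";
  // Tier 1: the slice lies entirely inside the ASCII prefix, where byte
  // offsets and UTF-16 offsets coincide, so the source string can be sliced.
  if (pos + len <= firstNonAsciiPos) return sourceText.substr(pos, len);
  const end = pos + len;
  // Tier 2: strings longer than 9 bytes go to `TextDecoder`.
  if (len > 9) return textDecoder.decode(bytes.subarray(pos, end));
  // Tier 3: short strings are built byte by byte to avoid the native call.
  let out = "";
  for (let i = pos; i < end; i++) {
    const c = bytes[i];
    // Assumption: on a non-ASCII byte, hand the whole slice to `TextDecoder`.
    if (c >= 128) return textDecoder.decode(bytes.subarray(pos, end));
    out += fromCodePoint(c);
  }
  return out;
}
```

Calling `findFirstNonAsciiPos` once per parse and passing the result into every `decodeSlice` call mirrors the "scan at init" idea: the scan costs one pass over the source at most, while the tier 1 check is a single comparison per string.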
Benchmarked on the `complicated()` test set (5 rounds of 30 iters, dropping round 1 for JIT warmup):

```
              Before     After
checker.ts    26.7ms    13.0ms   -51%
cal.com.tsx   15.4ms     9.1ms   -41%
antd.js       53.8ms    44.7ms   -17%
pdf.mjs        3.8ms     4.3ms   noise
binder.ts      0.7ms     0.5ms   noise
Total        100.5ms    71.6ms   -29%
```

Also verified across 15 files: the 10 non-ASCII files above plus 5 ASCII files (`react.development.js`, `binder.ts`, `moment.js`, `jquery.js`, `vue.js`). ASCII files are unchanged (our code only touches the non-ASCII path). -16% across non-ASCII files, no regressions.

## References

- Addresses the `TODO: Find best switch-over point` in `STR_DESERIALIZER_BODY`
- Related to perf goals in #19918

<details>
<summary>I appreciate this is already a bloated PR description (sorry @overlookmotel) but given the fairly unusual approach I thought you might want to see a very short summary of all 20 experiments Claude clauded through:</summary>

The loop works like this: edit code, benchmark, keep if faster, `git reset --hard` if not. Metric is total deserialization time across the benchmark corpus.

**Baseline: 97.7ms** (5-file corpus, all non-ASCII)

| # | Idea | Result | Verdict |
|---|------|--------|---------|
| 1 | Always use TextDecoder, delete the loop entirely | 99.2ms | Revert. TextDecoder's ~78ns fixed overhead kills short strings. |
| 2 | Lower TextDecoder threshold from 50 to 10 (ts.js only) | 89.8ms | **Keep.** First real win: moves 50% of strings off the concat loop. |
| 3 | Threshold 5 | 91.3ms | Revert. Too aggressive, too many short strings go to TextDecoder. |
| 4 | Threshold 8 | 95.0ms | Revert. Worse than 10. |
| 5 | Threshold 12 | 91.2ms | Revert. Worse than 10. |
| 6 | Threshold 10 + unrolled `switch` on len for inline `String.fromCharCode(uint8[pos], ...)` | 90.9ms | Revert. Switch dispatch overhead eats the gain. |
| 7 | Threshold 10 + special fast path for len=1 | 94.8ms | Revert. Extra branch hurts more than the 1-byte optimization helps. |
| 8 | Threshold 10 + accumulate char codes in array, single `fromCharCode.apply` at end | 90.5ms | Revert. Array allocation overhead. |
| 9 | Threshold 15 | 93.2ms | Revert. 10 is still the sweet spot. |
| 10 | Unrolled ASCII check for bytes 1-4, TextDecoder for 5+ | 94.7ms | Revert. Branching overhead. |
| 11 | **Apply threshold 10 to js.js too** (had only been changing ts.js) | 82.5ms | **Keep.** Facepalm moment: antd.js uses the JS deserializer. |
| 12 | Threshold 9 in both files | 77.9ms | **Keep.** New best. |
| 13 | Threshold 7 | 79.0ms | Revert. 9 wins. |
| 14 | Threshold 11 | 84.8ms | Revert. 9 confirmed. |
| 15 | Replace `fromCodePoint` with `String.fromCharCode` in the loop | 82.5ms | Revert. V8 optimizes the pre-extracted `fromCodePoint` better. |
| 16 | Various unrolled `fromCharCode` approaches for short strings | n/a | Abandoned, too complex for marginal gain. |
| 17 | Always TextDecoder for non-source strings (remove loop) | 96.1ms | Revert. Confirms the short-string loop IS valuable for 1-9 bytes. |
| 18 | `Buffer.from().toString()` instead of TextDecoder | 80.4ms | Revert. TextDecoder is faster. |
| 19 | `firstNonAsciiPos` only (use substr before it, TextDecoder after, no loop) | 84.9ms | Revert. checker.ts loved it (-34%) but antd.js hated it (+20%) because its first non-ASCII byte is at 1.3%. |
| 20 | **`firstNonAsciiPos` + threshold 9 + keep the loop** | 76.4ms | **Keep.** Best of both worlds: substr where possible, TextDecoder for medium strings, loop for short. |

Three things I (Claude) learned:

- Experiment 11 was the biggest single win and it was just... applying the change to the other file. Embarrassing.
- Every attempt to replace the `fromCodePoint` loop for short strings (1-9 bytes) made things worse. The loop is genuinely good for that range.
- `firstNonAsciiPos` only works when combined with the threshold change. On its own it hurts files where non-ASCII appears early (antd.js, cal.com.tsx).

</details>

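The measurement protocol quoted in the description (5 rounds of 30 iterations, with round 1 dropped for JIT warmup) could be reproduced with a harness along these lines. This is a sketch under assumptions, not the script used for the numbers above: the name `benchMs` is hypothetical, and averaging the remaining rounds is a guess, since the description does not say how rounds were aggregated.

```javascript
// Hypothetical harness; `benchMs` and the mean-of-rounds aggregation are assumptions.
// Uses the global `performance` object available in recent Node.js versions.
function benchMs(fn, rounds = 5, iters = 30) {
  const perRound = [];
  for (let r = 0; r < rounds; r++) {
    const start = performance.now();
    for (let i = 0; i < iters; i++) fn();
    // Mean milliseconds per call within this round.
    perRound.push((performance.now() - start) / iters);
  }
  // Drop round 1, which includes JIT warmup, then average the remaining rounds.
  // Requires `rounds >= 2`, otherwise nothing is left to average.
  const kept = perRound.slice(1);
  return kept.reduce((sum, ms) => sum + ms, 0) / kept.length;
}
```

A keep-or-revert loop like the one described would compare the value returned for the edited code against the current best and discard the edit if it is not lower.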
diff --git a/apps/oxlint/src-js/generated/deserialize.js b/apps/oxlint/src-js/generated/deserialize.js
@@ -10,6 +10,7 @@ let uint8,
   sourceText,
   sourceIsAscii,
   sourceStartPos,
+  firstNonAsciiPos,
   parent = null,
   getLoc;
 
@@ -42,6 +43,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
   float64 = buffer.float64;
   sourceText = sourceTextInput;
   sourceIsAscii = sourceText.length === sourceByteLen;
+  if (!sourceIsAscii) {
+    firstNonAsciiPos = sourceByteLen;
+    for (let i = sourceStartPos, e = sourceStartPos + sourceByteLen; i < e; i++)
+      if (uint8[i] >= 128) {
+        firstNonAsciiPos = i - sourceStartPos;
+        break;
+      }
+  }
   getLoc = getLocInput;
   return deserialize(uint32[536870900]);
 }
@@ -5857,11 +5866,12 @@ function deserializeStr(pos) {
     len = uint32[pos32 + 2];
   if (len === 0) return "";
   pos = uint32[pos32];
-  if (sourceIsAscii && pos >= sourceStartPos) return sourceText.substr(pos - sourceStartPos, len);
-  // Longer strings use `TextDecoder`
-  // TODO: Find best switch-over point
+  if (pos >= sourceStartPos && (sourceIsAscii || pos - sourceStartPos + len <= firstNonAsciiPos))
+    return sourceText.substr(pos - sourceStartPos, len);
+  // Use `TextDecoder` for strings longer than 9 bytes.
+  // For shorter strings, the byte-by-byte loop below avoids native call overhead.
   let end = pos + len;
-  if (len > 50) return decodeStr(uint8.subarray(pos, end));
+  if (len > 9) return decodeStr(uint8.subarray(pos, end));
   // Shorter strings decode by hand to avoid native call
   let out = "",
     c;
diff --git a/napi/parser/src-js/generated/deserialize/js.js b/napi/parser/src-js/generated/deserialize/js.js
@@ -1,7 +1,7 @@
 // Auto-generated code, DO NOT EDIT DIRECTLY!
 // To edit this generated file you have to edit `tasks/ast_tools/src/generators/raw_transfer.rs`.
 
-let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos;
+let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos, firstNonAsciiPos;
 
 const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
   decodeStr = textDecoder.decode.bind(textDecoder),
@@ -20,6 +20,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
   float64 = buffer.float64;
   sourceText = sourceTextInput;
   sourceIsAscii = sourceText.length === sourceByteLen;
+  if (!sourceIsAscii) {
+    firstNonAsciiPos = sourceByteLen;
+    for (let i = 0; i < sourceByteLen; i++)
+      if (uint8[i] >= 128) {
+        firstNonAsciiPos = i;
+        break;
+      }
+  }
   return deserialize(uint32[536870900]);
 }
 
@@ -4513,11 +4521,12 @@ function deserializeStr(pos) {
     len = uint32[pos32 + 2];
   if (len === 0) return "";
   pos = uint32[pos32];
-  if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
-  // Longer strings use `TextDecoder`
-  // TODO: Find best switch-over point
+  if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
+    return sourceText.substr(pos, len);
+  // Use `TextDecoder` for strings longer than 9 bytes.
+  // For shorter strings, the byte-by-byte loop below avoids native call overhead.
   let end = pos + len;
-  if (len > 50) return decodeStr(uint8.subarray(pos, end));
+  if (len > 9) return decodeStr(uint8.subarray(pos, end));
   // Shorter strings decode by hand to avoid native call
   let out = "",
     c;
diff --git a/napi/parser/src-js/generated/deserialize/js_parent.js b/napi/parser/src-js/generated/deserialize/js_parent.js
@@ -7,6 +7,7 @@ let uint8,
   sourceText,
   sourceIsAscii,
   sourceEndPos,
+  firstNonAsciiPos,
   parent = null;
 
 const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
@@ -26,6 +27,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
   float64 = buffer.float64;
   sourceText = sourceTextInput;
   sourceIsAscii = sourceText.length === sourceByteLen;
+  if (!sourceIsAscii) {
+    firstNonAsciiPos = sourceByteLen;
+    for (let i = 0; i < sourceByteLen; i++)
+      if (uint8[i] >= 128) {
+        firstNonAsciiPos = i;
+        break;
+      }
+  }
   return deserialize(uint32[536870900]);
 }
 
@@ -5049,11 +5058,12 @@ function deserializeStr(pos) {
     len = uint32[pos32 + 2];
   if (len === 0) return "";
   pos = uint32[pos32];
-  if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
-  // Longer strings use `TextDecoder`
-  // TODO: Find best switch-over point
+  if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
+    return sourceText.substr(pos, len);
+  // Use `TextDecoder` for strings longer than 9 bytes.
+  // For shorter strings, the byte-by-byte loop below avoids native call overhead.
   let end = pos + len;
-  if (len > 50) return decodeStr(uint8.subarray(pos, end));
+  if (len > 9) return decodeStr(uint8.subarray(pos, end));
   // Shorter strings decode by hand to avoid native call
   let out = "",
     c;
diff --git a/napi/parser/src-js/generated/deserialize/js_range.js b/napi/parser/src-js/generated/deserialize/js_range.js
@@ -1,7 +1,7 @@
 // Auto-generated code, DO NOT EDIT DIRECTLY!
 // To edit this generated file you have to edit `tasks/ast_tools/src/generators/raw_transfer.rs`.
 
-let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos;
+let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos, firstNonAsciiPos;
 
 const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
   decodeStr = textDecoder.decode.bind(textDecoder),
@@ -20,6 +20,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
   float64 = buffer.float64;
   sourceText = sourceTextInput;
   sourceIsAscii = sourceText.length === sourceByteLen;
+  if (!sourceIsAscii) {
+    firstNonAsciiPos = sourceByteLen;
+    for (let i = 0; i < sourceByteLen; i++)
+      if (uint8[i] >= 128) {
+        firstNonAsciiPos = i;
+        break;
+      }
+  }
   return deserialize(uint32[536870900]);
 }
 
@@ -5063,11 +5071,12 @@ function deserializeStr(pos) {
     len = uint32[pos32 + 2];
   if (len === 0) return "";
   pos = uint32[pos32];
-  if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
-  // Longer strings use `TextDecoder`
-  // TODO: Find best switch-over point
+  if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
+    return sourceText.substr(pos, len);
+  // Use `TextDecoder` for strings longer than 9 bytes.
+  // For shorter strings, the byte-by-byte loop below avoids native call overhead.
   let end = pos + len;
-  if (len > 50) return decodeStr(uint8.subarray(pos, end));
+  if (len > 9) return decodeStr(uint8.subarray(pos, end));
   // Shorter strings decode by hand to avoid native call
   let out = "",
     c;
diff --git a/napi/parser/src-js/generated/deserialize/js_range_parent.js b/napi/parser/src-js/generated/deserialize/js_range_parent.js
@@ -7,6 +7,7 @@ let uint8,
   sourceText,
   sourceIsAscii,
   sourceEndPos,
+  firstNonAsciiPos,
   parent = null;
 
 const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
@@ -26,6 +27,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
   float64 = buffer.float64;
   sourceText = sourceTextInput;
   sourceIsAscii = sourceText.length === sourceByteLen;
+  if (!sourceIsAscii) {
+    firstNonAsciiPos = sourceByteLen;
+    for (let i = 0; i < sourceByteLen; i++)
+      if (uint8[i] >= 128) {
+        firstNonAsciiPos = i;
+        break;
+      }
+  }
   return deserialize(uint32[536870900]);
 }
 
@@ -5602,11 +5611,12 @@ function deserializeStr(pos) {
     len = uint32[pos32 + 2];
   if (len === 0) return "";
   pos = uint32[pos32];
-  if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
-  // Longer strings use `TextDecoder`
-  // TODO: Find best switch-over point
+  if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
+    return sourceText.substr(pos, len);
+  // Use `TextDecoder` for strings longer than 9 bytes.
+  // For shorter strings, the byte-by-byte loop below avoids native call overhead.
   let end = pos + len;
-  if (len > 50) return decodeStr(uint8.subarray(pos, end));
+  if (len > 9) return decodeStr(uint8.subarray(pos, end));
   // Shorter strings decode by hand to avoid native call
   let out = "",
     c;
diff --git a/napi/parser/src-js/generated/deserialize/ts.js b/napi/parser/src-js/generated/deserialize/ts.js
@@ -1,7 +1,7 @@
 // Auto-generated code, DO NOT EDIT DIRECTLY!
 // To edit this generated file you have to edit `tasks/ast_tools/src/generators/raw_transfer.rs`.
 
-let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos;
+let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos, firstNonAsciiPos;
 
 const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
   decodeStr = textDecoder.decode.bind(textDecoder),
@@ -20,6 +20,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
   float64 = buffer.float64;
   sourceText = sourceTextInput;
   sourceIsAscii = sourceText.length === sourceByteLen;
+  if (!sourceIsAscii) {
+    firstNonAsciiPos = sourceByteLen;
+    for (let i = 0; i < sourceByteLen; i++)
+      if (uint8[i] >= 128) {
+        firstNonAsciiPos = i;
+        break;
+      }
+  }
   return deserialize(uint32[536870900]);
 }
 
@@ -4822,11 +4830,12 @@ function deserializeStr(pos) {
     len = uint32[pos32 + 2];
   if (len === 0) return "";
   pos = uint32[pos32];
-  if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
-  // Longer strings use `TextDecoder`
-  // TODO: Find best switch-over point
+  if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
+    return sourceText.substr(pos, len);
+  // Use `TextDecoder` for strings longer than 9 bytes.
+  // For shorter strings, the byte-by-byte loop below avoids native call overhead.
   let end = pos + len;
-  if (len > 50) return decodeStr(uint8.subarray(pos, end));
+  if (len > 9) return decodeStr(uint8.subarray(pos, end));
   // Shorter strings decode by hand to avoid native call
   let out = "",
     c;
diff --git a/napi/parser/src-js/generated/deserialize/ts_parent.js b/napi/parser/src-js/generated/deserialize/ts_parent.js
@@ -7,6 +7,7 @@ let uint8,
   sourceText,
   sourceIsAscii,
   sourceEndPos,
+  firstNonAsciiPos,
   parent = null;
 
 const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
@@ -26,6 +27,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
   float64 = buffer.float64;
   sourceText = sourceTextInput;
   sourceIsAscii = sourceText.length === sourceByteLen;
+  if (!sourceIsAscii) {
+    firstNonAsciiPos = sourceByteLen;
+    for (let i = 0; i < sourceByteLen; i++)
+      if (uint8[i] >= 128) {
+        firstNonAsciiPos = i;
+        break;
+      }
+  }
   return deserialize(uint32[536870900]);
 }
 
@@ -5385,11 +5394,12 @@ function deserializeStr(pos) {
     len = uint32[pos32 + 2];
   if (len === 0) return "";
   pos = uint32[pos32];
-  if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
-  // Longer strings use `TextDecoder`
-  // TODO: Find best switch-over point
+  if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
+    return sourceText.substr(pos, len);
+  // Use `TextDecoder` for strings longer than 9 bytes.
+  // For shorter strings, the byte-by-byte loop below avoids native call overhead.
   let end = pos + len;
-  if (len > 50) return decodeStr(uint8.subarray(pos, end));
+  if (len > 9) return decodeStr(uint8.subarray(pos, end));
   // Shorter strings decode by hand to avoid native call
   let out = "",
     c;
diff --git a/napi/parser/src-js/generated/deserialize/ts_range.js b/napi/parser/src-js/generated/deserialize/ts_range.js
@@ -1,7 +1,7 @@
 // Auto-generated code, DO NOT EDIT DIRECTLY!
 // To edit this generated file you have to edit `tasks/ast_tools/src/generators/raw_transfer.rs`.
 
-let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos;
+let uint8, uint32, float64, sourceText, sourceIsAscii, sourceEndPos, firstNonAsciiPos;
 
 const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
   decodeStr = textDecoder.decode.bind(textDecoder),
@@ -20,6 +20,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
   float64 = buffer.float64;
   sourceText = sourceTextInput;
   sourceIsAscii = sourceText.length === sourceByteLen;
+  if (!sourceIsAscii) {
+    firstNonAsciiPos = sourceByteLen;
+    for (let i = 0; i < sourceByteLen; i++)
+      if (uint8[i] >= 128) {
+        firstNonAsciiPos = i;
+        break;
+      }
+  }
   return deserialize(uint32[536870900]);
 }
 
@@ -5403,11 +5411,12 @@ function deserializeStr(pos) {
     len = uint32[pos32 + 2];
   if (len === 0) return "";
   pos = uint32[pos32];
-  if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
-  // Longer strings use `TextDecoder`
-  // TODO: Find best switch-over point
+  if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
+    return sourceText.substr(pos, len);
+  // Use `TextDecoder` for strings longer than 9 bytes.
+  // For shorter strings, the byte-by-byte loop below avoids native call overhead.
   let end = pos + len;
-  if (len > 50) return decodeStr(uint8.subarray(pos, end));
+  if (len > 9) return decodeStr(uint8.subarray(pos, end));
   // Shorter strings decode by hand to avoid native call
   let out = "",
     c;
diff --git a/napi/parser/src-js/generated/deserialize/ts_range_parent.js b/napi/parser/src-js/generated/deserialize/ts_range_parent.js
@@ -7,6 +7,7 @@ let uint8,
   sourceText,
   sourceIsAscii,
   sourceEndPos,
+  firstNonAsciiPos,
   parent = null;
 
 const textDecoder = new TextDecoder("utf-8", { ignoreBOM: true }),
@@ -26,6 +27,14 @@ function deserializeWith(buffer, sourceTextInput, sourceByteLen, getLocInput, de
   float64 = buffer.float64;
   sourceText = sourceTextInput;
   sourceIsAscii = sourceText.length === sourceByteLen;
+  if (!sourceIsAscii) {
+    firstNonAsciiPos = sourceByteLen;
+    for (let i = 0; i < sourceByteLen; i++)
+      if (uint8[i] >= 128) {
+        firstNonAsciiPos = i;
+        break;
+      }
+  }
   return deserialize(uint32[536870900]);
 }
 
@@ -5966,11 +5975,12 @@ function deserializeStr(pos) {
     len = uint32[pos32 + 2];
   if (len === 0) return "";
   pos = uint32[pos32];
-  if (sourceIsAscii && pos < sourceEndPos) return sourceText.substr(pos, len);
-  // Longer strings use `TextDecoder`
-  // TODO: Find best switch-over point
+  if (pos < sourceEndPos && (sourceIsAscii || pos + len <= firstNonAsciiPos))
+    return sourceText.substr(pos, len);
+  // Use `TextDecoder` for strings longer than 9 bytes.
+  // For shorter strings, the byte-by-byte loop below avoids native call overhead.
   let end = pos + len;
-  if (len > 50) return decodeStr(uint8.subarray(pos, end));
+  if (len > 9) return decodeStr(uint8.subarray(pos, end));
   // Shorter strings decode by hand to avoid native call
   let out = "",
     c;
diff --git a/tasks/ast_tools/src/generators/raw_transfer.rs b/tasks/ast_tools/src/generators/raw_transfer.rs