
Commit bc7d55e

revert: splitIntoPotentialTokens lookup tables
CodSpeed reports a ~19% instruction-count regression on both `helpers/splitIntoPotentialTokens` benchmarks (and ~11% on the `original-source` streamChunks benchmarks that call it) when the chained `===` comparisons are replaced with Uint8Array lookups. Restore the comparison-chain form; the other perf changes from the previous commit (readMappings, streamChunksOfSourceMap, createMappingsSerializer, ConcatSource, streamChunksOfCombinedSourceMap) are unaffected.

https://claude.ai/code/session_013RELTj96iEXrmMSPxnwjeR
1 parent a241607 commit bc7d55e

1 file changed

Lines changed: 9 additions & 24 deletions

lib/helpers/splitIntoPotentialTokens.js

@@ -13,25 +13,6 @@
 // \r = 13
 // \t = 9
 
-// Two Uint8Array lookup tables replace the chained `===` comparisons in the
-// hot scan loops. V8 keeps the tables in L1 as a constant, so the inner
-// condition becomes a single bounds check plus a typed-array load, which is
-// cheaper than 4–6 branches per character for long inputs.
-// Indexed by charCode; entries outside the ASCII range are implicitly 0.
-const BOUNDARY = new Uint8Array(126);
-BOUNDARY[10] = 1; // \n
-BOUNDARY[59] = 1; // ;
-BOUNDARY[123] = 1; // {
-BOUNDARY[125] = 1; // }
-
-const SEPARATOR = new Uint8Array(126);
-SEPARATOR[9] = 1; // \t
-SEPARATOR[13] = 1; // \r
-SEPARATOR[32] = 1; // space
-SEPARATOR[59] = 1; // ;
-SEPARATOR[123] = 1; // {
-SEPARATOR[125] = 1; // }
-
 /**
  * @param {string} str string
  * @returns {string[] | null} array of string separated by potential tokens
@@ -45,14 +26,18 @@ const splitIntoPotentialTokens = (str) => {
 		const start = i;
 		block: {
 			let cc = str.charCodeAt(i);
-			// Advance through non-boundary characters. Non-ASCII codepoints
-			// (cc >= 126) are by definition not boundaries.
-			while (cc >= 126 || BOUNDARY[cc] === 0) {
+			while (cc !== 10 && cc !== 59 && cc !== 123 && cc !== 125) {
 				if (++i >= len) break block;
 				cc = str.charCodeAt(i);
 			}
-			// Consume trailing separators so they stay grouped with the token.
-			while (cc < 126 && SEPARATOR[cc] === 1) {
+			while (
+				cc === 59 ||
+				cc === 32 ||
+				cc === 123 ||
+				cc === 125 ||
+				cc === 13 ||
+				cc === 9
+			) {
 				if (++i >= len) break block;
 				cc = str.charCodeAt(i);
 			}
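
Since the revert swaps the loop conditions back rather than changing what they accept, one way to sanity-check it is to compare the two forms over every UTF-16 code unit. The sketch below is not part of the commit: the table contents and conditions are copied from the diff above, while the helper names (`isBoundaryTable`, `isBoundaryChain`, `isSeparatorTable`, `isSeparatorChain`, `countMismatches`) are hypothetical and exist only for this illustration.

```javascript
// Lookup tables as they appear in the removed lines of the diff.
const BOUNDARY = new Uint8Array(126);
BOUNDARY[10] = 1; // \n
BOUNDARY[59] = 1; // ;
BOUNDARY[123] = 1; // {
BOUNDARY[125] = 1; // }

const SEPARATOR = new Uint8Array(126);
SEPARATOR[9] = 1; // \t
SEPARATOR[13] = 1; // \r
SEPARATOR[32] = 1; // space
SEPARATOR[59] = 1; // ;
SEPARATOR[123] = 1; // {
SEPARATOR[125] = 1; // }

// Hypothetical helpers: each wraps one loop condition from the diff.
// The first loop runs while the character is NOT a boundary, so the
// boundary predicates are the negation of that condition.
const isBoundaryTable = (cc) => !(cc >= 126 || BOUNDARY[cc] === 0);
const isBoundaryChain = (cc) =>
	!(cc !== 10 && cc !== 59 && cc !== 123 && cc !== 125);

// The second loop runs while the character IS a separator.
const isSeparatorTable = (cc) => cc < 126 && SEPARATOR[cc] === 1;
const isSeparatorChain = (cc) =>
	cc === 59 || cc === 32 || cc === 123 || cc === 125 || cc === 13 || cc === 9;

// Count code units (0..0xFFFF, the range charCodeAt can return for a
// valid index) on which the table form and the chain form disagree.
const countMismatches = () => {
	let mismatches = 0;
	for (let cc = 0; cc <= 0xffff; cc++) {
		if (isBoundaryTable(cc) !== isBoundaryChain(cc)) mismatches++;
		if (isSeparatorTable(cc) !== isSeparatorChain(cc)) mismatches++;
	}
	return mismatches;
};

console.log(countMismatches()); // prints 0
```

A count of zero suggests the two forms classify every code unit identically, so the revert should only affect speed, not the tokens produced. It says nothing about the regression itself, which the commit message attributes to CodSpeed's instruction counts.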
