perf(minifier): drop per-call buffers in try_fold_concat#22596
Conversation
627742a to
0e9f810
Compare
Merging this PR will not alter performance
Comparing Footnotes
|
4a56de6 to
9357205
Compare
How to use the Graphite Merge QueueAdd either label to this PR to merge it via the merge queue:
You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. This stack of pull requests is managed by Graphite. Learn more about stacking. |
Appends the comprehensive synthetic fixture just merged at oxc-project/benchmark-files@main/kitchen-sink.tsx to the existing 4-file bench input set (RadixUIAdoptionSection / react.development / cal.com / binder) without replacing any of them. Also adds it to TestFiles::complicated() so alloc-tracking picks it up. Fixture stats (cross-checked): 21171 lines / 735 KB / 133222 AST nodes / 4757 scopes / 7016 symbols / 15983 resolved references / 177 classes / 0 semantic errors. Hand-written and intentionally diverse across every AST node, transformer plugin, minifier optimization, and semantic step in oxc — designed to surface general-purpose perf wins above the ~1-2% noise floor that recent niche PRs got lost in (#22596 already verified: minifier -1.5% mean / -3.7% min). AI disclosure: drafted with Claude Code, reviewed manually.
Appends the comprehensive synthetic fixture just merged at oxc-project/benchmark-files@main/kitchen-sink.tsx to the existing 4-file bench input set (RadixUIAdoptionSection / react.development / cal.com / binder) without replacing any of them. Also adds it to TestFiles::complicated() so alloc-tracking picks it up. Fixture stats (cross-checked): 21171 lines / 735 KB / 133222 AST nodes / 4757 scopes / 7016 symbols / 15983 resolved references / 177 classes / 0 semantic errors. Hand-written and intentionally diverse across every AST node, transformer plugin, minifier optimization, and semantic step in oxc — designed to surface general-purpose perf wins above the ~1-2% noise floor that recent niche PRs got lost in (#22596 already verified: minifier -1.5% mean / -3.7% min). AI disclosure: drafted with Claude Code, reviewed manually.
9357205 to
1b51e68
Compare
3a7691e to
f1886cc
Compare
Appends the comprehensive synthetic fixture just merged at oxc-project/benchmark-files@main/kitchen-sink.tsx to the existing 4-file bench input set (RadixUIAdoptionSection / react.development / cal.com / binder) without replacing any of them. Also adds it to TestFiles::complicated() so alloc-tracking picks it up. Fixture stats (cross-checked): 21171 lines / 735 KB / 133222 AST nodes / 4757 scopes / 7016 symbols / 15983 resolved references / 177 classes / 0 semantic errors. Hand-written and intentionally diverse across every AST node, transformer plugin, minifier optimization, and semantic step in oxc — designed to surface general-purpose perf wins above the ~1-2% noise floor that recent niche PRs got lost in (#22596 already verified: minifier -1.5% mean / -3.7% min). AI disclosure: drafted with Claude Code, reviewed manually.
f1886cc to
eb76d6d
Compare
1b51e68 to
13a600f
Compare
13a600f to
b7760b1
Compare
eb76d6d to
929db20
Compare
929db20 to
b6085db
Compare
13c1b29 to
aa95145
Compare
ee83fb5 to
4439904
Compare
c17a6f3 to
aa95145
Compare
4439904 to
494d5dc
Compare
aa95145 to
353e4fd
Compare
Appends the comprehensive synthetic fixture just merged at oxc-project/benchmark-files@main/kitchen-sink.tsx to the existing 4-file bench input set (RadixUIAdoptionSection / react.development / cal.com / binder) without replacing any of them. Also adds it to TestFiles::complicated() so alloc-tracking picks it up. Fixture stats (cross-checked): 21171 lines / 735 KB / 133222 AST nodes / 4757 scopes / 7016 symbols / 15983 resolved references / 177 classes / 0 semantic errors. Hand-written and intentionally diverse across every AST node, transformer plugin, minifier optimization, and semantic step in oxc — designed to surface general-purpose perf wins above the ~1-2% noise floor that recent niche PRs got lost in (#22596 already verified: minifier -1.5% mean / -3.7% min). AI disclosure: drafted with Claude Code, reviewed manually.
494d5dc to
d515956
Compare
353e4fd to
6e84c1c
Compare
## Summary Adds `kitchen-sink.tsx` — a comprehensive synthetic TypeScript+JSX fixture maintained at [oxc-project/benchmark-files](https://github.com/oxc-project/benchmark-files) — to both `TestFiles::minimal()` (bench input set) and `TestFiles::complicated()` (alloc-tracking input set). The existing files in each set are untouched; this is a strict append. ## Why The existing bench input set didn't reliably surface general-purpose perf wins above the ~1-2% measurement noise floor: - #22580 (semantic pre-reserve) — visible because `binder.ts` exercises it - #22594 (formatter buffer) — visible - #22596 (minifier `try_fold_concat`) — **not visible** on the old set - #22599 (semantic resolve-refs no-temp-Vec) — **not visible** - #22603 (semantic var-hoist SmallVec) — **not visible** The kitchen-sink fixes that by exercising every AST node, every transformer plugin, every minifier optimization opportunity, and every semantic step in one large file. Verified by re-benching #22596 against this fixture: **minifier mean −1.5%, min −3.7%** — above noise, signal confirmed. ## Fixture stats (cross-checked locally) | Metric | Value | |---|---| | Source size | 21,117 lines / 732.90 kB | | AST nodes | ~133,000 | | Scopes | ~4,750 | | Symbols | ~7,000 | | Resolved references | ~16,000 | | Semantic diagnostics | 0 errors / 0 warnings | ## Snap baselines `tasks/track_memory_allocations/allocs_*.snap` updated with the kitchen-sink row across all 5 pipelines (parser / semantic / transformer / minifier / formatter). Future PRs that change allocation behavior on this fixture will produce a snap diff in CI. ## Bench-cleaner fix `tasks/benchmark/benches/lexer.rs`'s `SourceCleaner` was missing `visit_ts_template_literal_type` — TypeScript type-level template literals (e.g. `` `${T}-${U}` `` in conditional / mapped types) are syntactically identical to value-level template literals, so the bench-mode lexer (without parser context) cannot distinguish them. Without the cleaner converting them to plain strings, kitchen-sink's type-level templates caused the lexer bench to swallow ~1 KB spans as a single `TemplateHead` and produce spurious "Unterminated string" / "Invalid Unicode escape" errors. One-line fix to mirror the existing `visit_template_literal` handling. AI disclosure: drafted with Claude Code, reviewed manually.
d515956 to
d6faed5
Compare
6e84c1c to
4e559e1
Compare
ba64d53 to
ac919bb
Compare
Merge activity
|
camc314
left a comment
There was a problem hiding this comment.
this feels slightly overkill lol
## Summary
When the minifier folds `.concat()` calls into template literals via `try_fold_concat` in `peephole/replace_known_methods.rs`, the original implementation built two short-lived intermediate buffers per call:
- `quasi_strs: Vec<Cow<'a, str>>` — fresh `std::Vec` allocation per call.
- A `String` inside each `Cow::Owned` — created when `Cow::to_mut()` cloned a `Borrowed` into an `Owned`, then grown via doubling as more arg strings were pushed in.
For files with many `.concat()` calls — common in bundled ES5 libraries where template literals weren't available — both structures were freshly allocated thousands of times per minify, with the `String` growing via doubling each time.
This PR replaces those per-call buffers with a single reusable scratch `String` held on `MinifierState`, and constructs `TemplateElement` AST nodes directly into the arena `Vec` as args are drained. The intermediate `Vec<Cow<'a, str>>` is gone entirely. After the first few folds the scratch's capacity stabilizes and subsequent folds are amortized zero std-heap allocs.
The state machine simplifies in the process: the `pushed_quasi: bool` flag is gone. Its invariant ("scratch holds an in-progress quasi") is naturally maintained because every expression flush clears the scratch, so a back-to-back expression's flush produces the required empty separator quasi without a branch.
## Allocation impact
`cargo allocs`, `allocs_minifier.snap` (baseline = parent commit, after resolve-refs and var-hoist landed):
| File | Size | Sys allocs before | Sys allocs after | Sys reallocs before | Sys reallocs after |
|---|---|---|---|---|---|
| `antd.js` | 6.69 MB ES5 | **4,652** | **3,084** (`−1,568`, `−33.7%`) | **1,622** | **53** (`−1,569`) |
| Other tracked files | various | unchanged | unchanged | unchanged | unchanged |
Only `antd.js` shows a change because it's the only tracked file that exercises `.concat()` folding heavily. ES5 bundled code uses `.concat()` instead of template literals; antd.js has thousands of these.
The headline `Sys reallocs` improvement (`1,622` → `53`) comes from eliminating the per-call `String` doubling. The headline `Sys allocs` improvement (`4,652` → `3,084`) comes from eliminating the per-call `Vec<Cow<'a, str>>` allocation.
## Commits
This PR contains two commits, kept separate for review history:
1. **`perf(minifier): pre-size buffers in try_fold_concat to eliminate growth reallocs`** — the original tactical fix that pre-sized the doomed-to-be-removed intermediate buffers. Dropped sys reallocs first but didn't touch sys allocs.
2. **`perf(minifier): drop per-call buffers in try_fold_concat`** — the root-cause refactor described above. Removes the intermediate buffers entirely. Cuts both columns.
Once the second commit lands, the pre-sizing in the first becomes dead-code (the buffers it pre-sized no longer exist). They're kept as separate commits so the diff against `main` is reviewable — feel free to squash on merge.
## How I found this
After #22580, profiling on `antd.js` minifier showed it had unusually high sys reallocs vs other tracked files. I wrote a backtrace-capturing `System` allocator wrapper and ran it on the minify path. The top realloc sites all converged on `try_fold_concat` — the overwhelming majority of captured reallocs came from this one function. After landing the pre-size fix, code review pushed for a root-cause fix that also addressed the `Sys allocs` column, which this refactor delivers.
## Timing
The minifier bench suite (originally) didn't include antd.js or other `.concat()`-heavy fixtures, so bench numbers locally were flat (no regression, no obvious improvement). With kitchen-sink in the bench input set (#22609), CodSpeed shows: **minifier mean −1.5%, min −3.7%** on kitchen-sink.
## Verification
- `cargo test -p oxc_minifier` — pass.
- `cargo minsize` — no size regressions (`minsize.snap` byte-identical).
- `cargo allocs` — only `antd.js` row changed in `allocs_minifier.snap`.
- `cargo fmt -p oxc_minifier` clean.
AI disclosure: drafted with Claude Code, reviewed manually.
d6faed5 to
4f289f1
Compare
ac919bb to
04d3065
Compare
### 🚀 Features - e857b0c napi/minify: Expose legalComments option and result (#20370) (Boshen) - 661132d parser: More friendly error messages for rest assignment target and rest binding element (#22719) (sapphi-red) - ee659b6 transformer/legacy-decorator: Add `strictNullChecks` option for nullable-union design:type (#22266) (Kyle Cannon) ### 🐛 Bug Fixes - e1d064e transformer/class-properties: Reparent lifted private method helpers (#22716) (Cameron) - 4ac0fca minifier: Preserve `0 && (module.exports = { ... })` cjs-module-lexer hint (#22729) (Dunqing) - 40ff611 minifier: Mark peephole loop changed when dropping dead-after-throw statement (#22722) (Dunqing) - 2f7b210 codegen: Emit pife-arrow/function leading comments inside the wrap (#22720) (Dunqing) - e184f74 parser: Improve invalid `import` property access diagnostic (#22693) (camc314) - 7baed9c transformer/private-method: Clear inherited strict flags (#22508) (camc314) - a9ad27e parser: Keep annotation comments leading without preceding newline (#22711) (Dunqing) - 9ea4d64 minifier: Re-evaluate pure/no-side-effects flags after peephole inlining (#22595) (Dunqing) - 07afbb6 minifier: Drop empty-body IIFE wrapper when called with arguments (#22589) (Dunqing) - fa7c463 semantic: Correct TS enum member symbol spans (#22689) (camc314) - 26b9396 semantic: Resolve parameter decorators outside parameter scope (#22623) (camc314) - b284045 parser: Switch to module goal eagerly on `export` (#22684) (Boshen) - dfa931d semantic: Propagate unresolved auto-increment enum value instead of defaulting to 0 (#22646) (Dunqing) - 69a6ba6 transformer/legacy-decorator: Emit Array for ReadonlyArray<T> in decorator metadata (#22265) (Kyle Cannon) - e421ef0 transformer/legacy-decorator: Return runtime binding for design:type (#22640) (Dunqing) - d61e1d7 codegen: Preserve verbatim text of pure/no-side-effects comments (#22525) (Dunqing) - 702b14e minifier: Preserve IIFE structure in DCE-only mode (#22547) (Dunqing) - 917da24 parser: Apply PURE comment through member-access chains (#22566) (Dunqing) - a069b1c codegen: Preserve quotes for cjs-module-lexer equality strings (#22551) (Dunqing) ### ⚡ Performance - 2f623b0 semantic: Skip unresolved checks for re-exports (#22660) (camc314) - 0d9553d semantic: Early-exit `check_object_expression` for objects with <2 properties (#22668) (Dunqing) - d721ad9 semantic: Use direct grandparent lookup for TS type parameters (#22658) (camc314) - 0aff288 semantic: Reorder numeric literal strict mode checks (#22657) (camc314) - 4d5ddb1 semantic: Reorder binding identifier checks (#22656) (camc314) - e32acd8 semantic: Reorder identifier ambient binding check (#22653) (camc314) - 09fe178 semantic: Reorder ident reference strict mode check (#22652) (camc314) - 4b6add2 semantic: Avoid duplicate ident clone for bindings (#22663) (camc314) - 82f9662 parser: Check identifier kind before context flag (#22662) (camc314) - d7cd951 parser: Fast path identifier parsing and inline operator helpers (#22650) (Boshen) - 7b84314 semantic: Use direct byte access for numeric leading-zero check (#22642) (camc314) - 0345a31 semantic: Pre-size class elements hash map (#22618) (camc314) - 04d3065 minifier: Drop per-call buffers in try_fold_concat (#22596) (Dunqing) - 4f289f1 semantic: Resolve_references_for_current_scope without a temp Vec (#22599) (Dunqing) - e862c15 semantic: Avoid heap alloc for var hoist scope ids (#22603) (Dunqing) - 8ff8674 semantic: Early return if `excess` is `0` in `Stats::increase_by` (#22616) (camc314) - 7a4120e semantic: Pre-reserve unresolved_references using Stats::references (#22580) (Dunqing) Co-authored-by: Dunqing <[email protected]>

Summary
When the minifier folds
.concat()calls into template literals viatry_fold_concatinpeephole/replace_known_methods.rs, the original implementation built two short-lived intermediate buffers per call:quasi_strs: Vec<Cow<'a, str>>— freshstd::Vecallocation per call.Stringinside eachCow::Owned— created whenCow::to_mut()cloned aBorrowedinto anOwned, then grown via doubling as more arg strings were pushed in.For files with many
.concat()calls — common in bundled ES5 libraries where template literals weren't available — both structures were freshly allocated thousands of times per minify, with theStringgrowing via doubling each time.This PR replaces those per-call buffers with a single reusable scratch
Stringheld onMinifierState, and constructsTemplateElementAST nodes directly into the arenaVecas args are drained. The intermediateVec<Cow<'a, str>>is gone entirely. After the first few folds the scratch's capacity stabilizes and subsequent folds are amortized zero std-heap allocs.The state machine simplifies in the process: the
pushed_quasi: boolflag is gone. Its invariant ("scratch holds an in-progress quasi") is naturally maintained because every expression flush clears the scratch, so a back-to-back expression's flush produces the required empty separator quasi without a branch.Allocation impact
cargo allocs,allocs_minifier.snap(baseline = parent commit, after resolve-refs and var-hoist landed):antd.js−1,568,−33.7%)−1,569)Only
antd.jsshows a change because it's the only tracked file that exercises.concat()folding heavily. ES5 bundled code uses.concat()instead of template literals; antd.js has thousands of these.The headline
Sys reallocsimprovement (1,622→53) comes from eliminating the per-callStringdoubling. The headlineSys allocsimprovement (4,652→3,084) comes from eliminating the per-callVec<Cow<'a, str>>allocation.Commits
This PR contains two commits, kept separate for review history:
perf(minifier): pre-size buffers in try_fold_concat to eliminate growth reallocs— the original tactical fix that pre-sized the doomed-to-be-removed intermediate buffers. Dropped sys reallocs first but didn't touch sys allocs.perf(minifier): drop per-call buffers in try_fold_concat— the root-cause refactor described above. Removes the intermediate buffers entirely. Cuts both columns.Once the second commit lands, the pre-sizing in the first becomes dead-code (the buffers it pre-sized no longer exist). They're kept as separate commits so the diff against
mainis reviewable — feel free to squash on merge.How I found this
After #22580, profiling on
antd.jsminifier showed it had unusually high sys reallocs vs other tracked files. I wrote a backtrace-capturingSystemallocator wrapper and ran it on the minify path. The top realloc sites all converged ontry_fold_concat— the overwhelming majority of captured reallocs came from this one function. After landing the pre-size fix, code review pushed for a root-cause fix that also addressed theSys allocscolumn, which this refactor delivers.Timing
The minifier bench suite (originally) didn't include antd.js or other
.concat()-heavy fixtures, so bench numbers locally were flat (no regression, no obvious improvement). With kitchen-sink in the bench input set (#22609), CodSpeed shows: minifier mean −1.5%, min −3.7% on kitchen-sink.Verification
cargo test -p oxc_minifier— pass.cargo minsize— no size regressions (minsize.snapbyte-identical).cargo allocs— onlyantd.jsrow changed inallocs_minifier.snap.cargo fmt -p oxc_minifierclean.AI disclosure: drafted with Claude Code, reviewed manually.