perf(allocator/vec2): align min amortized cap size with `std` by Dunqing · Pull Request #9857 · oxc-project/oxc

Dunqing · 2025-03-18T08:22:36Z

This aligns with https://doc.rust-lang.org/src/alloc/raw_vec.rs.html#653-656. Unfortunately, the change produced a performance regression, which was caused by `let cap = cmp::max(Self::MIN_NON_ZERO_CAP, cap);`. So I commented that line out and added comments describing why. Although performance is unchanged as a result, keeping the rest of the implementation the same as the standard library's is still nice to have.
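For readers unfamiliar with the line in question, here is a minimal sketch of amortized capacity growth with and without the minimum-capacity clamp. It is not the code from this PR: `amortized_cap` and its `clamp_to_min` flag are hypothetical names for illustration, and `min_non_zero_cap` only mirrors the rule the standard library uses (8 for 1-byte elements, 4 for elements up to 1024 bytes, otherwise 1).

```rust
use std::cmp;

/// Minimum non-zero capacity, mirroring the rule in std's `RawVec`:
/// 8 for 1-byte elements, 4 for elements up to 1024 bytes, otherwise 1.
const fn min_non_zero_cap(elem_size: usize) -> usize {
    if elem_size == 1 {
        8
    } else if elem_size <= 1024 {
        4
    } else {
        1
    }
}

/// Hypothetical helper (not the oxc API): computes the capacity to grow to.
/// `clamp_to_min` toggles the `cmp::max(MIN_NON_ZERO_CAP, cap)` step.
fn amortized_cap(
    current_cap: usize,
    len: usize,
    additional: usize,
    elem_size: usize,
    clamp_to_min: bool,
) -> Option<usize> {
    // Fail instead of wrapping if the requested length overflows.
    let required_cap = len.checked_add(additional)?;
    // Doubling the capacity is what makes repeated pushes amortized O(1).
    let cap = cmp::max(current_cap.saturating_mul(2), required_cap);
    if clamp_to_min {
        Some(cmp::max(min_non_zero_cap(elem_size), cap))
    } else {
        Some(cap)
    }
}

fn main() {
    // First push into an empty Vec of 80-byte elements.
    println!("with clamp:    {:?}", amortized_cap(0, 0, 1, 80, true));
    println!("without clamp: {:?}", amortized_cap(0, 0, 1, 80, false));
}
```

With the clamp, the first push reserves room for 4 elements; without it, only 1. Later growth doubles in both cases.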

Dunqing · 2025-03-18T08:23:05Z

perf(allocator/vec2): align min amortized cap size with std #9857 👈 (View in Graphite)
feat(allocator/vec2): align RawVec::reserve with standard library implementation #10701
perf(allocator/vec2): replace self.reserve(1) calls with self.grow_one() for better efficiency #9856
feat(allocator/vec2): add specialized grow_one method #9855
perf(allocator/vec2): calling Bump::grow or Bump::shrink at the call site directly instead of calling realloc #10686
perf(allocator/vec2): resolve performance regression for extend by marking reserve as #[cold] and #[inline(never)] #10675
feat(allocator/vec2): introduce extend_desugared method as extend internal implementation #10670
perf(transformer/refresh): use take/take_in instead of drain #10656
perf(transformer): optimize inserting var/let statements #10654
main


codspeed-hq · 2025-03-18T08:29:14Z

CodSpeed Instrumentation Performance Report

Merging #9857 will not alter performance

Comparing `03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std` (4eaef66) with `main` (7f2f247)

Summary

✅ 36 untouched benchmarks

overlookmotel

@Dunqing This is a really interesting discovery: giving `Vec`s a minimum capacity of 4 is a perf regression for `oxc_transformer` etc. (it'd be 4 for all AST types, as they're all between 8 and 1024 bytes).

I suspect that the reason isn't the cost of the `cmp::max` call in `grow_amortized` (as your comment says), but has to do with CPU cache usage.

As you say, many `Vec`s in the AST require fewer than 4 elements, e.g. the `Vec<FormalParameter>` in `function foo(x) {}`. `FormalParameter` is 80 bytes, so if there's spare capacity of 3 in the `Vec`, that leaves 240 bytes unused in the middle of the arena. With data spread out across the arena, traversing the AST will produce more L1/L2 cache misses.

That's just a theory, but I doubt one `cmp::max` call could produce a significant perf regression, especially as it's on a cold path.
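The arithmetic behind the 240-byte figure can be sketched as follows. This is an illustration only: `unused_bytes` and `cache_lines` are hypothetical helpers, the 80-byte element size is the one quoted above for `FormalParameter`, and the 64-byte cache line is an assumption about typical hardware, not something measured in this PR.

```rust
/// Bytes reserved but never written when a Vec with `len` elements
/// has been given capacity `cap`. Hypothetical helper for illustration.
fn unused_bytes(elem_size: usize, len: usize, cap: usize) -> usize {
    cap.saturating_sub(len) * elem_size
}

/// Number of cache lines needed to cover `bytes`, rounding up.
/// Assumes 64-byte cache lines.
fn cache_lines(bytes: usize) -> usize {
    const CACHE_LINE: usize = 64; // assumption, varies by CPU
    (bytes + CACHE_LINE - 1) / CACHE_LINE
}

fn main() {
    // Size quoted in the comment above for `FormalParameter`.
    const ELEM_SIZE: usize = 80;

    // `function foo(x) {}` has exactly one parameter.
    let with_min_cap = unused_bytes(ELEM_SIZE, 1, 4);
    let exact_cap = unused_bytes(ELEM_SIZE, 1, 1);

    println!("capacity 4: {} unused bytes", with_min_cap);
    println!("capacity 1: {} unused bytes", exact_cap);
    println!(
        "the gap spans roughly {} cache lines",
        cache_lines(with_min_cap)
    );
}
```

In a bump arena the unused slots are not reclaimed, so every small `Vec` allocated this way pushes the next allocation further away in memory.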

overlookmotel · 2025-05-03T14:33:43Z

Merge activity

May 3, 10:33 AM EDT: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
May 3, 10:52 AM EDT: overlookmotel added this pull request to the Graphite merge queue.
May 3, 11:11 AM EDT: Merged by the Graphite merge queue.


Dunqing · 2025-05-16T08:11:41Z

I agree with your point! I tried pre-allocating a capacity of 4 or 8 for most `Vec` usages in the parser, and the benchmark result surprised me: it doesn't improve performance but hurts it by 1%-2%. The reason is probably what you described.

github-actions bot added the C-performance Category - Solution not expected to change functional behavior, only performance label Mar 18, 2025

Dunqing mentioned this pull request Jan 18, 2026

Improve Vec2 oxc-project/backlog#199

Open

12 tasks

Dunqing force-pushed the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch from 800ce69 to eafa7ae Compare March 18, 2025 08:36

Dunqing force-pushed the 03-18-perf_allocator_vec2_replace_self.reserve_1_calls_with_self.grow_one_for_better_efficiency branch from 4dc483f to 9d1db26 Compare March 18, 2025 08:36

Dunqing mentioned this pull request Mar 18, 2025

refactor(allocator/vec2): rename parameters and method name to align with std #9858

Merged

Dunqing force-pushed the 03-18-perf_allocator_vec2_replace_self.reserve_1_calls_with_self.grow_one_for_better_efficiency branch from 9d1db26 to c94a0f0 Compare March 18, 2025 08:59

Dunqing force-pushed the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch from eafa7ae to 00f7adb Compare March 18, 2025 08:59

Dunqing changed the base branch from 03-18-perf_allocator_vec2_replace_self.reserve_1_calls_with_self.grow_one_for_better_efficiency to graphite-base/9857 March 18, 2025 10:36

Dunqing force-pushed the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch from 00f7adb to d354fa1 Compare March 18, 2025 10:58

Dunqing force-pushed the graphite-base/9857 branch 2 times, most recently from 07ac0f1 to 43613e1 Compare March 18, 2025 16:35

Dunqing force-pushed the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch from d354fa1 to 66bd195 Compare March 18, 2025 16:35

Dunqing changed the base branch from graphite-base/9857 to 03-18-perf_allocator_vec2_replace_self.reserve_1_calls_with_self.grow_one_for_better_efficiency March 18, 2025 16:35

Dunqing marked this pull request as draft March 20, 2025 09:48

Dunqing changed the base branch from 03-18-perf_allocator_vec2_replace_self.reserve_1_calls_with_self.grow_one_for_better_efficiency to graphite-base/9857 March 21, 2025 08:44

Dunqing force-pushed the graphite-base/9857 branch from 43613e1 to e8c0a03 Compare March 21, 2025 08:54

Dunqing force-pushed the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch from 66bd195 to 01cc973 Compare March 21, 2025 08:54

Dunqing changed the base branch from graphite-base/9857 to 04-29-feat_allocator_vec2_align_rawvec_reserve_with_standard_library_implementation April 30, 2025 07:31

Dunqing force-pushed the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch 3 times, most recently from 8ee03a4 to 9927f60 Compare April 30, 2025 11:36

Dunqing marked this pull request as ready for review April 30, 2025 11:41

graphite-app bot force-pushed the 04-29-feat_allocator_vec2_align_rawvec_reserve_with_standard_library_implementation branch from df1271b to b3bd1f4 Compare May 3, 2025 13:12

graphite-app bot force-pushed the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch from 9927f60 to 7d5ef8e Compare May 3, 2025 13:13

overlookmotel approved these changes May 3, 2025


overlookmotel added the 0-merge Merge with Graphite Merge Queue label May 3, 2025

graphite-app bot force-pushed the 04-29-feat_allocator_vec2_align_rawvec_reserve_with_standard_library_implementation branch from b3bd1f4 to b34bd06 Compare May 3, 2025 14:54

graphite-app bot force-pushed the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch from 7d5ef8e to 84407cc Compare May 3, 2025 14:55

graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label May 3, 2025

overlookmotel force-pushed the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch from 84407cc to 4eaef66 Compare May 3, 2025 15:04

overlookmotel force-pushed the 04-29-feat_allocator_vec2_align_rawvec_reserve_with_standard_library_implementation branch from b34bd06 to 3cd3d23 Compare May 3, 2025 15:04

overlookmotel added the 0-merge Merge with Graphite Merge Queue label May 3, 2025

graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label May 3, 2025

Base automatically changed from 04-29-feat_allocator_vec2_align_rawvec_reserve_with_standard_library_implementation to main May 3, 2025 15:10

graphite-app bot merged commit 4eaef66 into main May 3, 2025
29 checks passed

graphite-app bot deleted the 03-18-perf_allocator_vec2_align_min_amortized_cap_size_with_std branch May 3, 2025 15:11

oxc-bot mentioned this pull request May 3, 2025

release(crates): v0.68.0 #10777

Merged
