perf(allocator/vec2): optimize reserving memory #9792
Merged
graphite-app[bot] merged 1 commit into main, Mar 17, 2025
CodSpeed Performance Report: merging #9792 will not alter performance.
resolve: #9656 (comment)

#9656 brought a small performance improvement to the transformer but also caused a 1% performance regression in the parser. This PR recovers that performance by splitting `reserve_internal` into `reserve_exact_internal` and `reserve_amortized_internal`, which are the internal implementations of `reserve_exact` and `reserve` respectively.

Why does this change improve performance?

The original `reserve_internal` implementation checks the reserve strategy, https://github.com/oxc-project/oxc/blob/fef680a4775559805e99622fb5aa6155cdf47034/crates/oxc_allocator/src/vec2/raw_vec.rs#L664-L668 which can be avoided because the caller of `reserve_internal` already knows the strategy. After this change, `reserve_exact` and `reserve` call their corresponding internal implementation directly, skipping the unnecessary check.

Likewise, the `Fallibility` check can be avoided, https://github.com/oxc-project/oxc/blob/fef680a4775559805e99622fb5aa6155cdf47034/crates/oxc_allocator/src/vec2/raw_vec.rs#L681-L683 because we know where the errors should be handled.

~~Due to this change, I also replaced Bumpalo's `CollectionAllocErr` with allocator-api2's `TryReserveError` because `CollectionAllocErr::AllocErr` cannot pass in a `Layout`.~~ I ended up reverting 937c61a because it caused a 1%-2% transformer performance regression (see [codspeed](https://codspeed.io/oxc-project/oxc/branches/03-15-pref_allocator_vec2_optimize_reserving_memory) and switch to the "replace CollectionAllocErr with TryReserveError" commit), and replaced it with 84edacd. I tried various ways to recover the performance, but none of them worked. I suspect the cause is that `TryReserveError` is 16 bytes whereas `CollectionAllocErr` is only 1 byte.

So, after both checks are removed, performance returns to its original level. The whole change follows the standard library's `RawVec` implementation. See https://doc.rust-lang.org/src/alloc/raw_vec.rs.html

<img width="608" alt="image" src="https://github.com/user-attachments/assets/53066d8e-26f0-4eb1-8f33-4ca9e517e75b" />
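To make the shape of the split concrete, here is a minimal sketch, not the actual `raw_vec.rs` code. It assumes a simplified model in which the internal functions only compute a new capacity instead of allocating. The `AllocErr` and `ReserveStrategy` types and the doubling growth rule are illustrative stand-ins, and the function names are reused here only to label the before and after shapes.

```rust
// Hypothetical, simplified model: these functions only compute the new
// capacity instead of allocating, and the types below are stand-ins rather
// than the real oxc_allocator or Bumpalo definitions.

#[derive(Debug, PartialEq)]
enum AllocErr {
    CapacityOverflow,
}

#[derive(Clone, Copy)]
enum ReserveStrategy {
    Exact,
    Amortized,
}

/// Before: one internal function re-checks the strategy on every call,
/// even though each caller always passes the same constant.
fn reserve_internal(
    cap: usize,
    used: usize,
    needed: usize,
    strategy: ReserveStrategy,
) -> Result<usize, AllocErr> {
    let required = used.checked_add(needed).ok_or(AllocErr::CapacityOverflow)?;
    Ok(match strategy {
        ReserveStrategy::Exact => required,
        ReserveStrategy::Amortized => required.max(cap.saturating_mul(2)),
    })
}

/// After: the exact path has no strategy branch.
fn reserve_exact_internal(used: usize, needed: usize) -> Result<usize, AllocErr> {
    used.checked_add(needed).ok_or(AllocErr::CapacityOverflow)
}

/// After: the amortized path grows to at least double the current capacity.
fn reserve_amortized_internal(cap: usize, used: usize, needed: usize) -> Result<usize, AllocErr> {
    let required = used.checked_add(needed).ok_or(AllocErr::CapacityOverflow)?;
    Ok(required.max(cap.saturating_mul(2)))
}

/// Infallible entry point: the error is handled here, at the call site,
/// so the internal function needs no fallibility flag.
fn reserve(cap: usize, used: usize, needed: usize) -> usize {
    match reserve_amortized_internal(cap, used, needed) {
        Ok(new_cap) => new_cap,
        Err(AllocErr::CapacityOverflow) => panic!("capacity overflow"),
    }
}

/// Fallible entry point: the error is simply passed back to the caller.
fn try_reserve(cap: usize, used: usize, needed: usize) -> Result<usize, AllocErr> {
    reserve_amortized_internal(cap, used, needed)
}

fn main() {
    // Both shapes compute the same capacities; only the branching differs.
    assert_eq!(reserve_internal(4, 4, 1, ReserveStrategy::Amortized), Ok(8));
    assert_eq!(reserve_amortized_internal(4, 4, 1), Ok(8));
    assert_eq!(reserve_internal(4, 4, 1, ReserveStrategy::Exact), Ok(5));
    assert_eq!(reserve_exact_internal(4, 1), Ok(5));
    assert_eq!(reserve(4, 4, 1), 8);
    assert_eq!(try_reserve(0, usize::MAX, 1), Err(AllocErr::CapacityOverflow));
}
```

In this model the two shapes return identical results, so any gain would come from removing the branches on the hot path rather than from a change in behavior.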
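The suspicion about error size can be checked with `std::mem::size_of`. The sketch below uses two hypothetical stand-in enums, not the real `CollectionAllocErr` or `TryReserveError`: one has no payload, the other carries a `std::alloc::Layout`. It only shows that a `Layout` payload makes the error, and any `Result` wrapping it, larger. It does not measure the actual types or prove that this caused the regression.

```rust
use std::alloc::Layout;
use std::mem::size_of;

// Hypothetical stand-in for a payload-free error such as the 1-byte
// error type described above.
#[allow(dead_code)]
enum SmallErr {
    CapacityOverflow,
    AllocErr,
}

// Hypothetical stand-in for an error that carries the failed `Layout`.
#[allow(dead_code)]
enum LayoutErr {
    CapacityOverflow,
    AllocErr { layout: Layout },
}

fn main() {
    // Exact sizes depend on the target, so they are printed, not assumed.
    println!("SmallErr:              {} bytes", size_of::<SmallErr>());
    println!("LayoutErr:             {} bytes", size_of::<LayoutErr>());
    println!("Result<(), SmallErr>:  {} bytes", size_of::<Result<(), SmallErr>>());
    println!("Result<(), LayoutErr>: {} bytes", size_of::<Result<(), LayoutErr>>());

    // The payload-carrying error can never be smaller than its payload.
    assert!(size_of::<LayoutErr>() >= size_of::<Layout>());
    assert!(size_of::<SmallErr>() < size_of::<LayoutErr>());
}
```

A larger error type makes every `Result` returned along the reserve path larger as well, which is one plausible way it could affect generated code.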