-
Notifications
You must be signed in to change notification settings - Fork 0
Description
The problem
SemanticBuilder and Transformer both create a large number of temporary data structures which are discarded at the end of the pass.
There are 3 problems with this:
- It's expensive calling into system allocator to allocate these temp structures.
- All these temp structures need to be dropped individually, which is also expensive.
- All this allocation/deallocation introduces variance into our benchmarks. This makes it harder to do other perf optimization work, as our measurements are somewhat unreliable.
Proposed solution
I propose we introduce "scratch space" for storing temporary data in. This would be a separate arena which is filled up with data during the pass, and dropped all in one go at the end. This would solve all 3 of these problems.
Call it ScratchAllocator.
It's also a good alternative to #89, which we have no solution for at present.
Prior art: serde appears to do this.
Implementation
We can reuse our existing infrastructure from oxc_allocator:
ScratchAllocatorwould be a bumpaloBump.- Temp structures use
oxc_allocator'sVecandBox. - hashbrown's
HashMapandHashSetsupport custom allocators in stable Rust. We don't have to implement our own hash map (thankfully!).
3 options for integrating this into our code:
Option 1: Create scratch arena at start of semantic/transformer pass
Create a ScratchAllocator at start of each Transformer pass.
I think we can align lifetimes so the &'a ScratchAllocator uses same lifetime as main &'a Allocator. So we don't need to introduce another lifetime.
Option 2: Reuse a single ScratchAllocator
Create a single ScratchAllocator and pass it in to Transformer::build and SemanticBuilder::build.
Advantages:
- We re-use 1
ScratchAllocatorover and over, so no overhead of allocating space for it on eachTransformer::buildcall. - The memory backing
ScratchAllocatorwill remain warm in CPU cache.
Disadvantages:
- More complex public API. User has to create both an
Allocatorand aScratchAllocator, and pass both of them around.
Option 3: Store scratch allocator inside Allocator
Same as option (2), but Allocator contains 2 arenas - main, and scratch.
struct Allocator {
ast: bumpalo::Bump,
scratch: bumpalo::Bump,
}Advantage: No changes to public API. User only creates one Allocator, same as now.
Disadvantage: Rolldown creates a separate Allocator per AST. Combining main arena and scratch arena into 1 structure prevents reusing the same ScratchAllocator for processing multiple ASTs.
Maybe there are additional APIs we can add for advanced use cases like Rolldown's, which allow taking the scratch arena out of an Allocator and creating a new Allocator that re-uses it - re-cycle a single scratch arena, while keeping the standard API simple.
Which?
I think probably option (3) is the best starting point.
Later on
Sometimes, a bump allocator is not ideal for our uses.
Often temp structures are created and destroyed on entering/leaving an AST node or scope. e.g. UnresolvedReferences in SemanticBuilder works like this. A stack-like allocator would be more appropriate for these cases.
In other places, temp structures are created and destroyed in non-sequential order. An allocator with a simple free list would be better for memory consumption. Maybe something like talc would be good.
We could use a bump allocator initially, and later on optimize further by using more appropriate allocators for these cases.
In meantime, we can use other mechanisms to recycle structures within the scratch arena, the way UnresolvedReferencesStack does. That may be sufficient for our needs, without changing the scratch allocator itself.