Buddy allocation for large buffers in adaptive allocator (#16053)#16132
Merged
Conversation
Motivation: The histogram bump allocating chunks have poor chunk and memory reuse in practice, which leads to higher memory usage than the pooled allocator whenever an application performs enough allocations of buffers that don't fit in our size-classed chunks. Buddy allocation should in theory reduce memory consumption by allowing memory reuse within a chunk, similar to the size-classed chunks, but for variable power-of-two sized allocations. We've found that beyond 16k-20k buffer sizes, allocations predominantly comes in power-of-two sizes, hence buddy allocation should be a good fit for this size regime. Modification: * Implement a new chunk type that does buddy allocation, based on an array-embedded binary search tree. * The tree is encoded as a dense byte array, with two bits marking node or child-node usage, and six bits to encode the node size. * The histogram pointer-bump allocating chunk implementation is removed, which unlocks potential simplifications and optimizations that will benefit both buddy and size-classed chunks. * The 32k and 64k size classes are kept for the time being, to keep chunk churn under control, but they are planned to be removed in a follow-up PR. Result: We generally get improvements to memory usage, because the buddy allocator is able to reuse its chunks before they are fully deallocated. If the 32k and 64k size-classes are removed, then the improvements continue to hold up, but we see an increase in allocation churn for buddy chunks. This needs to be investigated and solved before we can remove the 32k and 64k size-classes. Presumably, it comes down to making better decisions about the size of the buddy chunks, and in picking which chunks to allocate from next once a magazine has exhausted its current chunk. (cherry picked from commit 2dbc1e7)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation:
The histogram bump allocating chunks have poor chunk and memory reuse in practice, which leads to higher memory usage than the pooled allocator whenever an application performs enough allocations of buffers that don't fit in our size-classed chunks.
Buddy allocation should in theory reduce memory consumption by allowing memory reuse within a chunk, similar to the size-classed chunks, but for variable power-of-two sized allocations.
We've found that beyond 16k-20k buffer sizes, allocations predominantly comes in power-of-two sizes, hence buddy allocation should be a good fit for this size regime.
Modification:
Result:
We generally get improvements to memory usage, because the buddy allocator is able to reuse its chunks before they are fully deallocated. If the 32k and 64k size-classes are removed, then the improvements continue to hold up, but we see an increase in allocation churn for buddy chunks.
This needs to be investigated and solved before we can remove the 32k and 64k size-classes.
Presumably, it comes down to making better decisions about the size of the buddy chunks, and in picking which chunks to allocate from next once a magazine has exhausted its current chunk.
(cherry picked from commit 2dbc1e7)