-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement LazyBTreeMap and use it in a few places. #50240
Conversation
r? @cramertj (rust_highfive has picked a reviewer for you, use r? to override) |
8735f24
to
40e8c45
Compare
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
This is a thin wrapper around BTreeMap that avoids allocating upon creation. It speeds up some rustc-perf benchmarks by up to 3.6%.
40e8c45
to
259ae18
Compare
Made the terrible mistake of clicking on this with my phone; will look at this closer tomorrow morning. Please feel free to ping me mercilessly assuming I have forgotten. I agree this is something that should be looked at seriously, and sympathize with the complexity issues. Could you elaborate on the stability issues with Iter and Range? ABI isn’t part of stability. Have you looked at how Google’s C++ BTree handles emptiness? |
Here's the current range:
Handles don't make sense when there aren't any nodes, so it'll have to change to something like this:
Which is fine, except I don't understand in detail what the
No. I didn't even know such a thing existed :) |
Google Btree, which i believe has the same high-level design (parent pointers), but is also a bit of a huge templated nightmare: https://code.google.com/archive/p/cpp-btree/ re: stability — yeah we only guarantee basic source-stability for stable types/methods. So just that they exist and work. It’s possible to rely on the layout of these types but that’s unexpected and not ensured (though we would consider an exception if it was really rampant). I wonder if we could do some kind of static empty sentinel thing..? |
Skimmed the code a bit more, and yeah it seems like a lot of this can be done by just adding checks for len == 0, or making iterators that have dangling-but-equal pointers (similar to how we do empty slices). |
@gankro and I discussed some more. This is all a bit complicated. The simplest thing is just to proceed with the patch as is, without moving the laziness into BTreeMap. @cramerjt, does that sound reasonable to you? |
( @cramertj ) |
📌 Commit 259ae18 has been approved by |
Implement LazyBTreeMap and use it in a few places. This is a thin wrapper around BTreeMap that avoids allocating upon creation. I would prefer to change BTreeMap directly to make it lazy (like I did with HashSet in #36734) and I initially attempted that by making BTreeMap::root an Option<>. But then I also had to change Iter and Range to handle trees with no root, and those types have stability markers on them and I wasn't sure if that was acceptable. Also, BTreeMap has a lot of complex code and changing it all was challenging, and I didn't have high confidence about my general approach. So I prototyped this wrapper instead and used it in the hottest locations to get some measurements about the effect. The measurements are pretty good! - Doing a debug build of serde, it reduces the total number of heap allocations from 17,728,709 to 13,359,384, a 25% reduction. The number of bytes allocated drops from 7,474,672,966 to 5,482,308,388, a 27% reduction. - It gives speedups of up to 3.6% on some rustc-perf benchmark jobs. crates.io, futures, and serde benefit most. ``` futures-check avg: -1.9% min: -3.6% max: -0.5% serde-check avg: -2.1% min: -3.5% max: -0.7% crates.io-check avg: -1.7% min: -3.5% max: -0.3% serde avg: -2.0% min: -3.0% max: -0.9% serde-opt avg: -1.8% min: -2.9% max: -0.3% futures avg: -1.5% min: -2.8% max: -0.4% tokio-webpush-simple-check avg: -1.1% min: -2.2% max: -0.1% futures-opt avg: -1.2% min: -2.1% max: -0.4% piston-image-check avg: -0.8% min: -1.1% max: -0.3% crates.io avg: -0.6% min: -1.0% max: -0.3% ``` @gankro, how do you think I should proceed here? Is leaving this as a wrapper reasonable? Or should I try to make BTreeMap itself lazy? If so, can I change the representation of Iter and Range? Thanks!
☀️ Test successful - status-appveyor, status-travis |
It was introduced in rust-lang#50240 to avoid an allocation when creating a new BTreeMap, which gave some speed-ups. But then rust-lang#50352 made that the default behaviour for BTreeMap, so LazyBTreeMap is no longer necessary.
…ertj Remove LazyBTreeMap. It was introduced in rust-lang#50240 to avoid an allocation when creating a new BTreeMap, which gave some speed-ups. But then rust-lang#50352 made that the default behaviour for BTreeMap, so LazyBTreeMap is no longer necessary.
…ertj Remove LazyBTreeMap. It was introduced in rust-lang#50240 to avoid an allocation when creating a new BTreeMap, which gave some speed-ups. But then rust-lang#50352 made that the default behaviour for BTreeMap, so LazyBTreeMap is no longer necessary.
This is a thin wrapper around BTreeMap that avoids allocating upon creation.
I would prefer to change BTreeMap directly to make it lazy (like I did with HashSet in #36734) and I initially attempted that by making BTreeMap::root an Option<>. But then I also had to change Iter and Range to handle trees with no root, and those types have stability markers on them and I wasn't sure if that was acceptable. Also, BTreeMap has a lot of complex code and changing it all was challenging, and I didn't have high confidence about my general approach.
So I prototyped this wrapper instead and used it in the hottest locations to get some measurements about the effect. The measurements are pretty good!
Doing a debug build of serde, it reduces the total number of heap allocations from 17,728,709 to 13,359,384, a 25% reduction. The number of bytes allocated drops from 7,474,672,966 to 5,482,308,388, a 27% reduction.
It gives speedups of up to 3.6% on some rustc-perf benchmark jobs. crates.io, futures, and serde benefit most.
@gankro, how do you think I should proceed here? Is leaving this as a wrapper reasonable? Or should I try to make BTreeMap itself lazy? If so, can I change the representation of Iter and Range?
Thanks!