Iterate backwards on zdiff/zinter/zunion to optimize for zslInsert#8105
Conversation
In the iterator for these functions, we'll traverse the sorted sets in a reversed way so that largest elements come first. We prefer this order because it's optimized for insertion in a skiplist, which is the destination of the elements being iterated in there functions.
|
My initial thought was that this iterator (zuiInitIterator) should take a boolean input indicating if the iteration should be forward or backward, and support both, but if it's only gonna be used in one direction, i guess there's no need for that. Regarding the name, interesting observation about the "ui" prefix, but in that case we should have changed it to "uid" (union, intersection, diff), but IMHO that's lame. I would like to ask that you conduct a simple benchmark on big objects to show that this change really improves performance (or at least doesn't hurt it). i.e. run ZUIONSTORE, ZINTERSTORE and ZDIFFSTORE (both algorithms) on large sets with the old and new code and measure the time (maybe using CONFIG RESETSTAT and INFO COMMANDSTATS). maybe do that once on skiplists, and once for ziplist (after setting zset-max-ziplist-entries to be very high). IIUC in ZUNION, there's no benefit of doing backwards or forward, maybe we'll notice that forward if more cache friendly. |
|
Heya @felipou and thanks for this. Would it possible to back up the "We prefer this order..." claim with a simple before/after benchmark? |
@itamarhaber Yes, will do, I'll follow @oranagra suggestions! |
I started thinking something along those lines, I was going to implement a zuiInitReverseIterator and zuiPrev, but when I noticed that we could use the same direction in all cases, so I decided it was simpler to just always reverse the order.
Ok, I'll leave them as is (good idea for the new name though).
Ok, I'll try to run some benchmarks today, thanks for the tips! |
|
I create a simple script for an initial benchmark with Python: https://gist.github.com/felipou/50a697856a24e1910f332648f34ba94c There are the results so far: Which shows a 3.33% reduction in time for zdiff, and 1.25% reduction in time for zunion. I expected better results, but upon better inspecting the zslInsert code, I've come to think that the gain in performance for inserting this way isn't that great compared to the constant times of the algorithm. But anyway, if you think this seems worth it, I can do some improved benchmarks (get the standard deviation to check if there really was a significant improvement, for example), and also create a benchmark for zinter. |
Nicely done, thank you! |
|
i also expected bigger impact, maybe if you'll use bigger objects the impact will be greater. (now each call takes just 2ms, maybe scale by 100 so that each call takes 200ms). since it's not a big effort and you're almost there, i'd rather completing the effort (i'm paranoid that in some of the cases maybe we're causing a damage and we won't realize it). i do think we want to cover all cases (i.e. including ziplists, and both zdiff algorithms)... assuming the effort isn't big. |
It's actually around 2s. At list that's what I observed, and I assumed "usec" was microseconds, so 10^-6 seconds, giving 2.12 seconds from the result of 2123177.75 usec per call. Previously I had tested with 10 million elements, and it took around 20 seconds.
No problem, I don't think the effort will be too big. I'll come up with a zinter benchmark, then try to do some optimization trick as you suggested to avoid the slow process of zadds. Then I'll just prepare the code to run everything and that's it. Should be finished by tomorrow. |
|
Just updated the benchmark codes, here are the results for the skiplist (the percentages are the reduction in average time): Raw results: Now I'll benchmark with ziplists (setting |
|
is there a reason to benchmark both zdiff and zdiffstore? shouldn't they give similar results? (8% and 6% are similar in my eyes 8-). |
|
FYI @filipecosta90 |
Actually no, I thought someone had asked for it, but now I see that I was mistaken. Anyway, I just finished the benchmarks forcing ziplists with size 50k (it was already slow enough to build the zsets with this size): And the raw results: If I'm not mistaken, I believe the zslInserts will always be performed in a skiplist, I think that's how the target zset is initialized on these algorithms. Since we're using much smaller zsets (the previous benchmark I ran with zsets with 1 million elements, and now just 50k), I think this is why it didn't show a larger performance gain. I'll try to benchmark both zdiff algorithms now using a skiplist. |
|
The results (reduction in average time): |
|
well, the improvement isn't huge, but it's certainly there, and there's probably no regression ( if we went that distance, let's make sure of that before we close this campaign. |
|
In the case of large (100k) ziplists, it seems the zunion is usually a little bit slower. It's hard to say for sure without using deeper statistics. I ran the zunion benchmark multiple times, and then ran again for skiplists to check again: I optimized the benchmark code to dump the binary values and load from a file, now it's a lot faster because we don't need to build the zsets every time, which was the slowest part of the benchmark. So now I can try anything it else very fast, and using the same data structures as input. |
|
from the numbers you posted, it looks like the average regression for ZUNION is about 1% for ziplists. considering ziplists are not expected to grow that big (100k entries), i think we should ignore this. |
…edis#8105) In the iterator for these functions, we'll traverse the sorted sets in a reversed way so that largest elements come first. We prefer this order because it's optimized for insertion in a skiplist, which is the destination of the elements being iterated in there functions.
In the iterator for these functions, we'll traverse the sorted sets
in a reversed way so that largest elements come first. We prefer
this order because it's optimized for insertion in a skiplist, which
is the destination of the elements being iterated in these functions.
I didn't modify the iteration code for sets (since it isn't ordered anyway),
and I also kept the function names (didn't rename to "reverse" or something),
just added a comment to zuiInitIterator. I feel like this should be further
documented, but I wasn't sure where to put further explanations.
I just noticed that the "ui" in "zuiInitIterator" is probably "union" and "inter".
Maybe we should rename all "zui" functions to "zuid"?