Optimize reading data from postings/impacts enums. #12664

jpountz · 2023-10-12T12:34:56Z

This implements the following optimizations:

ImpactsEnum#advance() and PostingsEnum#advance() now need a single comparison instead of two when the target has already been decoded in the current block.
ImpactsEnum#nextDoc() now gets a dedicated implementation instead of delegating to advance(). Initial support for dynamic pruning via WAND always used advance() in Lucene, but we are now using nextDoc() since we switched to MAXSCORE.
ImpactsEnum#nextDoc() and ImpactsEnum#advance() now decode frequencies lazily. The code was already assuming frequencies would get decoded lazily, it was just missing a small change to actually work.

This implements the following optimizations: - `ImpactsEnum#advance()` and `PostingsEnum#advance()` now need a single comparison instead of two when the target has already been decoded in the current block. - `ImpactsEnum#nextDoc()` now gets a dedicated implementation instead of delegating to `advance()`. Initial support for dynamic pruning via WAND always used `advance()` in Lucene, but we are now using `nextDoc()` since we switched to `MAXSCORE`. - `ImpactsEnum#nextDoc()` and `ImpactsEnum#advance()` now decode frequencies lazily. The code was already assuming frequencies would get decoded lazily, it was just missing a small change to actually work.

jpountz · 2023-10-12T12:38:46Z

luceneutil on wikibigall gave good results, better than I expected:

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
           HighTermDayOfYearSort      270.41      (2.0%)      255.63      (2.3%)   -5.5% (  -9% -   -1%) 0.000
                 CountOrHighHigh       57.44     (16.1%)       54.37     (11.0%)   -5.4% ( -27% -   25%) 0.220
                  CountOrHighMed       89.17     (15.9%)       84.60     (11.1%)   -5.1% ( -27% -   25%) 0.237
                 CountAndHighMed      122.57      (3.4%)      120.69      (4.4%)   -1.5% (  -8% -    6%) 0.212
                         Prefix3      105.15      (5.9%)      104.66      (5.4%)   -0.5% ( -11% -   11%) 0.795
                        Wildcard      112.31      (3.2%)      111.94      (3.4%)   -0.3% (  -6% -    6%) 0.758
                     CountPhrase        4.41      (2.2%)        4.40      (4.2%)   -0.3% (  -6% -    6%) 0.776
                          IntNRQ      164.43     (13.4%)      163.96     (12.7%)   -0.3% ( -23% -   29%) 0.945
                         Respell       73.48      (1.9%)       73.51      (1.7%)    0.0% (  -3% -    3%) 0.944
                       CountTerm    17115.51      (4.7%)    17122.81      (6.5%)    0.0% ( -10% -   11%) 0.981
                          Fuzzy2      126.72      (1.2%)      127.18      (1.4%)    0.4% (  -2% -    2%) 0.380
                        PKLookup      224.32      (1.8%)      225.19      (2.1%)    0.4% (  -3% -    4%) 0.532
                          Fuzzy1      148.99      (1.3%)      149.61      (1.5%)    0.4% (  -2% -    3%) 0.345
                       LowPhrase       22.22      (3.9%)       22.38      (3.6%)    0.7% (  -6% -    8%) 0.529
                         LowTerm     1050.65      (6.0%)     1058.66      (4.2%)    0.8% (  -8% -   11%) 0.641
                       MedPhrase       69.20      (4.0%)       69.91      (3.3%)    1.0% (  -6% -    8%) 0.377
                         MedTerm      611.39      (7.0%)      618.72      (4.8%)    1.2% (  -9% -   13%) 0.525
               HighTermMonthSort     5173.19      (2.6%)     5241.52      (2.2%)    1.3% (  -3% -    6%) 0.084
                        HighTerm      425.31      (7.9%)      431.22      (6.0%)    1.4% ( -11% -   16%) 0.532
                      HighPhrase       45.70      (5.1%)       46.39      (4.3%)    1.5% (  -7% -   11%) 0.311
                CountAndHighHigh       40.59      (4.0%)       41.33      (5.1%)    1.8% (  -6% -   11%) 0.208
                       OrHighLow      554.87      (3.7%)      579.95      (3.8%)    4.5% (  -2% -   12%) 0.000
                      OrHighHigh       47.17      (4.9%)       49.35      (5.1%)    4.6% (  -5% -   15%) 0.003
                     AndHighHigh       69.41      (4.1%)       73.39      (3.9%)    5.7% (  -2% -   14%) 0.000
                       OrHighMed      244.77      (3.4%)      261.52      (3.3%)    6.8% (   0% -   13%) 0.000
                      AndHighMed      124.85      (3.4%)      134.47      (3.4%)    7.7% (   0% -   15%) 0.000
                      AndHighLow     1059.79      (2.7%)     1152.15      (2.2%)    8.7% (   3% -   14%) 0.000

All OrXY and AndYY tasks show a good speedup with a p-value equal to 0. I'm not sure about queries that show a regression, maybe they didn't like that the implementation of BlockDocsEnum#advance() is no longer forcefully inlined.

jpountz · 2023-10-12T12:48:01Z

lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90PostingsReader.java

        pforUtil.decodeAndPrefixSum(docIn, accum, docBuffer);
        if (indexHasFreqs) {
-          pforUtil.decode(docIn, freqBuffer);
+          isFreqsRead = false;


This was the change needed to decode freqs lazily.

jpountz · 2023-10-12T12:52:46Z

lucene/core/src/java/org/apache/lucene/codecs/MultiLevelSkipListReader.java


    while (level >= 0) {
-      if (target > skipDoc[level]) {
+      if (target > skipDoc[level] || skipDoc[level] == 0) {


It's a bit annoying I had to change skip lists, but it was required so that skip lists and postings blocks remain aligned. Otherwise advanceShallow(0) would set nextSkipDoc equal to 0 instead of the last doc ID of the first block.

jpountz · 2023-10-12T13:08:39Z

I tried to re-inline advance() but it still suggested a small slowdown for HighTermDayOfYearSort and CountAndHighMed too so I suspect that there is something else. There is indeed one more check when one needs to move to a different block, but I would expect the cost to be dominated by the decoding of a full block in that case so that it shouldn't make a big difference.

The slowdown on CountOrHighHigh and CountOrHighMed reported above should be either noise or due to the JVM compiling things differently because of other changes, given that PostingsEnum#nextDoc() wasn't changed.

I'm tempted to not think too hard about HighTermDayOfYearSort and CountAndHighMed since there are more queries that get faster than slower with this change. While it's not big, I would expect the speedup on CountAndHighHigh to be real since it generally needs to advance postings by small increments, which is what the change to PostingsEnum#advance is supposed to help with.

jpountz · 2023-10-12T13:14:31Z

Another potential reason for the slowdowns is that they're caused by the changes to MultiLevelSkipListReader. Removing the additional checks would require changing the way we encode postings and skip lists to start from -1 instead of 0, ie. changes on the write side too. (This feels like it would be a good change in general anyway, e.g. if you have a term that is present in every document today, we don't enable the "all deltas are the same" optimization for the first block of postings because the first delta in the block is actually 0 and not 1.)

jpountz · 2023-10-13T06:57:45Z

I'll split this PR into smaller ones, so that it's easier to understand the impact of each change.

jpountz added 2 commits October 12, 2023 14:15

CHANGE

b247d60

jpountz commented Oct 12, 2023

View reviewed changes

Tidy.

2912a50

jpountz closed this Oct 13, 2023

jpountz mentioned this pull request Oct 13, 2023

Fix lazy decoding of frequencies in BlockImpactsDocsEnum. #12668

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Optimize reading data from postings/impacts enums. #12664

Optimize reading data from postings/impacts enums. #12664

Uh oh!

jpountz commented Oct 12, 2023

Uh oh!

jpountz commented Oct 12, 2023 •

edited

Loading

Uh oh!

jpountz Oct 12, 2023

Uh oh!

jpountz Oct 12, 2023

Uh oh!

jpountz commented Oct 12, 2023

Uh oh!

jpountz commented Oct 12, 2023

Uh oh!

jpountz commented Oct 13, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Optimize reading data from postings/impacts enums. #12664

Optimize reading data from postings/impacts enums. #12664

Uh oh!

Conversation

jpountz commented Oct 12, 2023

Uh oh!

jpountz commented Oct 12, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jpountz Oct 12, 2023

Choose a reason for hiding this comment

Uh oh!

jpountz Oct 12, 2023

Choose a reason for hiding this comment

Uh oh!

jpountz commented Oct 12, 2023

Uh oh!

jpountz commented Oct 12, 2023

Uh oh!

jpountz commented Oct 13, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jpountz commented Oct 12, 2023 •

edited

Loading