Skip to content

Conversation

@jpountz
Copy link
Contributor

@jpountz jpountz commented Oct 12, 2023

This implements the following optimizations:

  • ImpactsEnum#advance() and PostingsEnum#advance() now need a single comparison instead of two when the target has already been decoded in the current block.
  • ImpactsEnum#nextDoc() now gets a dedicated implementation instead of delegating to advance(). Initial support for dynamic pruning via WAND always used advance() in Lucene, but we are now using nextDoc() since we switched to MAXSCORE.
  • ImpactsEnum#nextDoc() and ImpactsEnum#advance() now decode frequencies lazily. The code was already assuming frequencies would get decoded lazily, it was just missing a small change to actually work.

This implements the following optimizations:
 - `ImpactsEnum#advance()` and `PostingsEnum#advance()` now need a single
   comparison instead of two when the target has already been decoded in the
   current block.
 - `ImpactsEnum#nextDoc()` now gets a dedicated implementation instead of
   delegating to `advance()`. Initial support for dynamic pruning via WAND
   always used `advance()` in Lucene, but we are now using `nextDoc()` since we
   switched to `MAXSCORE`.
 - `ImpactsEnum#nextDoc()` and `ImpactsEnum#advance()` now decode frequencies
   lazily. The code was already assuming frequencies would get decoded lazily,
   it was just missing a small change to actually work.
@jpountz
Copy link
Contributor Author

jpountz commented Oct 12, 2023

luceneutil on wikibigall gave good results, better than I expected:

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
           HighTermDayOfYearSort      270.41      (2.0%)      255.63      (2.3%)   -5.5% (  -9% -   -1%) 0.000
                 CountOrHighHigh       57.44     (16.1%)       54.37     (11.0%)   -5.4% ( -27% -   25%) 0.220
                  CountOrHighMed       89.17     (15.9%)       84.60     (11.1%)   -5.1% ( -27% -   25%) 0.237
                 CountAndHighMed      122.57      (3.4%)      120.69      (4.4%)   -1.5% (  -8% -    6%) 0.212
                         Prefix3      105.15      (5.9%)      104.66      (5.4%)   -0.5% ( -11% -   11%) 0.795
                        Wildcard      112.31      (3.2%)      111.94      (3.4%)   -0.3% (  -6% -    6%) 0.758
                     CountPhrase        4.41      (2.2%)        4.40      (4.2%)   -0.3% (  -6% -    6%) 0.776
                          IntNRQ      164.43     (13.4%)      163.96     (12.7%)   -0.3% ( -23% -   29%) 0.945
                         Respell       73.48      (1.9%)       73.51      (1.7%)    0.0% (  -3% -    3%) 0.944
                       CountTerm    17115.51      (4.7%)    17122.81      (6.5%)    0.0% ( -10% -   11%) 0.981
                          Fuzzy2      126.72      (1.2%)      127.18      (1.4%)    0.4% (  -2% -    2%) 0.380
                        PKLookup      224.32      (1.8%)      225.19      (2.1%)    0.4% (  -3% -    4%) 0.532
                          Fuzzy1      148.99      (1.3%)      149.61      (1.5%)    0.4% (  -2% -    3%) 0.345
                       LowPhrase       22.22      (3.9%)       22.38      (3.6%)    0.7% (  -6% -    8%) 0.529
                         LowTerm     1050.65      (6.0%)     1058.66      (4.2%)    0.8% (  -8% -   11%) 0.641
                       MedPhrase       69.20      (4.0%)       69.91      (3.3%)    1.0% (  -6% -    8%) 0.377
                         MedTerm      611.39      (7.0%)      618.72      (4.8%)    1.2% (  -9% -   13%) 0.525
               HighTermMonthSort     5173.19      (2.6%)     5241.52      (2.2%)    1.3% (  -3% -    6%) 0.084
                        HighTerm      425.31      (7.9%)      431.22      (6.0%)    1.4% ( -11% -   16%) 0.532
                      HighPhrase       45.70      (5.1%)       46.39      (4.3%)    1.5% (  -7% -   11%) 0.311
                CountAndHighHigh       40.59      (4.0%)       41.33      (5.1%)    1.8% (  -6% -   11%) 0.208
                       OrHighLow      554.87      (3.7%)      579.95      (3.8%)    4.5% (  -2% -   12%) 0.000
                      OrHighHigh       47.17      (4.9%)       49.35      (5.1%)    4.6% (  -5% -   15%) 0.003
                     AndHighHigh       69.41      (4.1%)       73.39      (3.9%)    5.7% (  -2% -   14%) 0.000
                       OrHighMed      244.77      (3.4%)      261.52      (3.3%)    6.8% (   0% -   13%) 0.000
                      AndHighMed      124.85      (3.4%)      134.47      (3.4%)    7.7% (   0% -   15%) 0.000
                      AndHighLow     1059.79      (2.7%)     1152.15      (2.2%)    8.7% (   3% -   14%) 0.000

All OrXY and AndYY tasks show a good speedup with a p-value equal to 0. I'm not sure about queries that show a regression, maybe they didn't like that the implementation of BlockDocsEnum#advance() is no longer forcefully inlined.

pforUtil.decodeAndPrefixSum(docIn, accum, docBuffer);
if (indexHasFreqs) {
pforUtil.decode(docIn, freqBuffer);
isFreqsRead = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the change needed to decode freqs lazily.


while (level >= 0) {
if (target > skipDoc[level]) {
if (target > skipDoc[level] || skipDoc[level] == 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit annoying I had to change skip lists, but it was required so that skip lists and postings blocks remain aligned. Otherwise advanceShallow(0) would set nextSkipDoc equal to 0 instead of the last doc ID of the first block.

@jpountz
Copy link
Contributor Author

jpountz commented Oct 12, 2023

I tried to re-inline advance() but it still suggested a small slowdown for HighTermDayOfYearSort and CountAndHighMed too so I suspect that there is something else. There is indeed one more check when one needs to move to a different block, but I would expect the cost to be dominated by the decoding of a full block in that case so that it shouldn't make a big difference.

The slowdown on CountOrHighHigh and CountOrHighMed reported above should be either noise or due to the JVM compiling things differently because of other changes, given that PostingsEnum#nextDoc() wasn't changed.

I'm tempted to not think too hard about HighTermDayOfYearSort and CountAndHighMed since there are more queries that get faster than slower with this change. While it's not big, I would expect the speedup on CountAndHighHigh to be real since it generally needs to advance postings by small increments, which is what the change to PostingsEnum#advance is supposed to help with.

@jpountz
Copy link
Contributor Author

jpountz commented Oct 12, 2023

Another potential reason for the slowdowns is that they're caused by the changes to MultiLevelSkipListReader. Removing the additional checks would require changing the way we encode postings and skip lists to start from -1 instead of 0, ie. changes on the write side too. (This feels like it would be a good change in general anyway, e.g. if you have a term that is present in every document today, we don't enable the "all deltas are the same" optimization for the first block of postings because the first delta in the block is actually 0 and not 1.)

@jpountz
Copy link
Contributor Author

jpountz commented Oct 13, 2023

I'll split this PR into smaller ones, so that it's easier to understand the impact of each change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant