Skip to content

Fix Numeric Tree Balance - [MOD-8081, MOD-8082]#5185

Merged
GuyAv46 merged 9 commits intomasterfrom
guyav-fix_numeric_tree
Nov 11, 2024
Merged

Fix Numeric Tree Balance - [MOD-8081, MOD-8082]#5185
GuyAv46 merged 9 commits intomasterfrom
guyav-fix_numeric_tree

Conversation

@GuyAv46
Copy link
Collaborator

@GuyAv46 GuyAv46 commented Nov 8, 2024

Describe the changes in the pull request

Fixing 2 bugs in the numeric tree regarding the tree balancing

  1. A miscalculation when rotating a subtree can cause the maxDepth value to become larger than both child subtrees, which can later yield sub-optimal tree balancing.
  2. An assumption that when a leaf got split in the subtree, the maxDepth value has to grow by 1 (which is not true when the splitting leaf was not in the max depth), could cause the parent range to get released prematurely. (only when _NUMERIC_RANGES_PARENTS is set to 2)

Starting with this fix, we will also start balancing the root node to achieve a better tree.

Note:

This is a performance fix. Both bugs don't affect the correctness of any search.

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

@fcostaoliveira
Copy link
Contributor

fcostaoliveira commented Nov 8, 2024

Automated performance analysis summary

This comment was automatically generated given there is performance data available.

In summary:

  • Detected a total of 32 stable tests between versions.
  • Detected a total of 7 highly unstable benchmarks.
  • Detected a total of 1 improvements above the improvement water line.
  • Detected a total of 5 regressions bellow the regression water line 5.0.

You can check a comparison in detail via the grafana link

Comparison between master and guyav-fix_numeric_tree.

Time Period from 30 days ago. (environment used: oss-standalone)

Test Case Baseline master (median obs. +- std.dev) Comparison guyav-fix_numeric_tree (median obs. +- std.dev) % change (higher-better) Note
ftsb-10K-enwiki_abstract-hashes-fulltext-sortby 69 +- 2.9% (7 datapoints) 68.0 -1.5% No Change
ftsb-10K-enwiki_abstract-hashes-term-prefix 7230 +- 5.5% (7 datapoints) 6308.0 -12.8% waterline=5.5%. REGRESSION
ftsb-10K-enwiki_abstract-hashes-term-suffix 2099 +- 3.6% (7 datapoints) 1925.0 -8.3% REGRESSION
ftsb-10K-enwiki_abstract-hashes-term-suffix-withsuffixtrie 60878 +- 3.9% (7 datapoints) 62132.0 2.1% No Change
ftsb-10K-enwiki_abstract-hashes-term-wildcard 12157 +- 6.8% (7 datapoints) 11499.0 -5.4% waterline=6.8%. potential REGRESSION
ftsb-10K-enwiki_pages-hashes-fulltext-mixed_simple-1word-query_write_1_to_read_20.yml 1255 +- 3.4% (7 datapoints) 1140.0 -9.2% REGRESSION
ftsb-10K-enwiki_pages-hashes-load 43933 +- 3.5% (7 datapoints) 42785.0 -2.6% No Change
ftsb-10K-multivalue-numeric-json 795 +- 2.1% (7 datapoints) 796.0 0.1% No Change
ftsb-10K-singlevalue-numeric-json 326 +- 2.7% (7 datapoints) 317.0 -3.0% potential REGRESSION
ftsb-1K-enwiki_abstract-hashes-term-contains 1775 +- 5.5% (7 datapoints) 1645.0 -7.4% waterline=5.5%. REGRESSION
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query 335 +- 12.4% UNSTABLE (7 datapoints) 331.0 -1.1% UNSTABLE (very high variance)
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query-non-sortable 38 +- 9.0% (7 datapoints) 37.0 -2.0% waterline=9.0%. No Change
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query@100_ops_sec 100 +- 0.0% (7 datapoints) 100.0 -0.0% No Change
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query 2501 +- 9.1% (7 datapoints) 2532.0 1.2% waterline=9.1%. No Change
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query-non-sortable 927 +- 14.1% UNSTABLE (7 datapoints) 1091.0 17.7% UNSTABLE (very high variance)
ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query@100_ops_sec 100 +- 0.0% (7 datapoints) 100.0 0.0% No Change
ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query 902 +- 18.6% UNSTABLE (7 datapoints) 1044.0 15.9% UNSTABLE (very high variance)
ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query@100_ops_sec 100 +- 0.0% (7 datapoints) 100.0 0.0% No Change
ftsb-1M-enwiki_abstract-hashes-load 21269 +- 5.9% (7 datapoints) 21501.0 1.1% waterline=5.9%. No Change
ftsb-1M-nyc_taxis-ftadd-load 17901 +- 2.6% (7 datapoints) 17532.0 -2.1% No Change
ftsb-1M-nyc_taxis-hashes-load 19020 +- 1.9% (7 datapoints) 18873.0 -0.8% No Change
search-aggregate-post-filter-simple.yml 81176 +- 3.5% (7 datapoints) 80601.0 -0.7% No Change
search-expire-doc-10-milliseconds 0.60 +- 2.5% (3 datapoints) 0.6 0.0%
search-expired-numeric-field-1000-seconds 0.60 +- 1.0% (3 datapoints) 0.6 0.0%
search-filtering-tag-numeric 181 +- 6.5% (7 datapoints) 174.0 -4.3% waterline=6.5%. potential REGRESSION
search-filtering-tag-numeric-filter-pipeline 18664 +- 2.9% (7 datapoints) 17792.0 -4.7% potential REGRESSION
search-ftsb-10K-enwiki_abstract-hashes-term-withoutsuffix-trie 36847 +- 2.5% (7 datapoints) 37889.0 2.8% No Change
search-ftsb-10K-enwiki_abstract-hashes-term-withsuffix-trie 37736 +- 3.1% (7 datapoints) 37958.0 0.6% No Change
search-ftsb-1700K-docs-union-iterators-q3 6.0 +- 3.3% (7 datapoints) 5.9 -1.8% No Change
search-ftsb-1M-enwiki_abstract-hashes-fulltext-2word-intersection-query-non-sortable@50_ops_sec 50 +- 8.8% (7 datapoints) 48.0 -4.1% waterline=8.8%. potential REGRESSION
search-ftsb-1M-enwiki_abstract-hashes-fulltext-2word-union-query-non-sortable@100_ops_sec 100 +- 0.0% (7 datapoints) 100.0 -0.0% No Change
search-ftsb-1M-enwiki_abstract-hashes-fulltext-simple-1word-query-non-sortable 153 +- 5.8% (7 datapoints) 148.0 -3.2% waterline=5.8%. potential REGRESSION
search-ftsb-370K-docs-union-iterators-q4 6.5 +- 2.1% (7 datapoints) 6.4 -1.7% No Change
search-geo 191 +- 3.2% (7 datapoints) 180.0 -5.6% REGRESSION
search-high-cardinality-negation-term-baseline 19 +- 5.4% (7 datapoints) 19.0 0.8% waterline=5.4%. No Change
search-high-cardinality-negation-term-comparison_union_all_other_terms 6.6 +- 3.4% (7 datapoints) 6.6 -0.6% No Change
search-numeric 3078 +- 27.9% UNSTABLE (7 datapoints) 2498.0 -18.8% UNSTABLE (very high variance)
search-numeric-optimize 8922 +- 2.1% (7 datapoints) 9110.0 2.1% No Change
search-numeric-sortby 3062 +- 26.7% UNSTABLE (7 datapoints) 2805.0 -8.4% UNSTABLE (very high variance)
search-numeric-sortby-desc 2766 +- 40.4% UNSTABLE (7 datapoints) 3040.0 9.9% UNSTABLE (very high variance)
search-numeric-sortby-desc-optimize 36 +- 34.2% UNSTABLE (7 datapoints) 31.0 -14.2% UNSTABLE (very high variance)
search-numeric-sortby-optimize 19 +- 2.9% (7 datapoints) 20.0 4.0% potential IMPROVEMENT
vecsim-arxiv-titles-384-angular-filters-m16-ef-128-fulltext-filter 483 +- 5.2% (7 datapoints) 466.0 -3.6% waterline=5.2%. potential REGRESSION
vecsim-arxiv-titles-384-angular-filters-m16-ef-128-numeric-filter 104 +- 9.9% (7 datapoints) 107.0 2.8% waterline=9.9%. No Change
vecsim-arxiv-titles-384-angular-filters-m16-ef-128-tag-filter 51454 +- 4.6% (7 datapoints) 54864.0 6.6% IMPROVEMENT

@codecov
Copy link

codecov bot commented Nov 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.50%. Comparing base (a93aa64) to head (c58d52c).
Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5185      +/-   ##
==========================================
+ Coverage   86.43%   86.50%   +0.07%     
==========================================
  Files         192      192              
  Lines       34784    34683     -101     
==========================================
- Hits        30066    30004      -62     
+ Misses       4718     4679      -39     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@GuyAv46 GuyAv46 requested review from alonre24 and oshadmi November 10, 2024 07:27
Copy link
Collaborator

@raz-mon raz-mon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! 💪 🔥
Some nitpicks mainly.
Some more documentation on the faulty scenarios and the fixes in the general comment would be good.

Copy link
Collaborator

@raz-mon raz-mon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💪

@GuyAv46 GuyAv46 added this pull request to the merge queue Nov 11, 2024
@GuyAv46 GuyAv46 removed this pull request from the merge queue due to a manual request Nov 11, 2024
@GuyAv46 GuyAv46 added this pull request to the merge queue Nov 11, 2024
Merged via the queue into master with commit 5884687 Nov 11, 2024
@GuyAv46 GuyAv46 deleted the guyav-fix_numeric_tree branch November 11, 2024 16:39
@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.8, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.8
git worktree add -d .worktree/backport-5185-to-2.8 origin/2.8
cd .worktree/backport-5185-to-2.8
git switch --create backport-5185-to-2.8
git cherry-pick -x 588468722680e3855f972f95580c299ae8bde2b1

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.6, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.6
git worktree add -d .worktree/backport-5185-to-2.6 origin/2.6
cd .worktree/backport-5185-to-2.6
git switch --create backport-5185-to-2.6
git cherry-pick -x 588468722680e3855f972f95580c299ae8bde2b1

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.10, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.10
git worktree add -d .worktree/backport-5185-to-2.10 origin/2.10
cd .worktree/backport-5185-to-2.10
git switch --create backport-5185-to-2.10
git cherry-pick -x 588468722680e3855f972f95580c299ae8bde2b1

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 8.0, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.0
git worktree add -d .worktree/backport-5185-to-8.0 origin/8.0
cd .worktree/backport-5185-to-8.0
git switch --create backport-5185-to-8.0
git cherry-pick -x 588468722680e3855f972f95580c299ae8bde2b1

@GuyAv46
Copy link
Collaborator Author

GuyAv46 commented Nov 11, 2024

/backport

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.8, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.8
git worktree add -d .worktree/backport-5185-to-2.8 origin/2.8
cd .worktree/backport-5185-to-2.8
git switch --create backport-5185-to-2.8
git cherry-pick -x 588468722680e3855f972f95580c299ae8bde2b1

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.6, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.6
git worktree add -d .worktree/backport-5185-to-2.6 origin/2.6
cd .worktree/backport-5185-to-2.6
git switch --create backport-5185-to-2.6
git cherry-pick -x 588468722680e3855f972f95580c299ae8bde2b1

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.10, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.10
git worktree add -d .worktree/backport-5185-to-2.10 origin/2.10
cd .worktree/backport-5185-to-2.10
git switch --create backport-5185-to-2.10
git cherry-pick -x 588468722680e3855f972f95580c299ae8bde2b1

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 8.0, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.0
git worktree add -d .worktree/backport-5185-to-8.0 origin/8.0
cd .worktree/backport-5185-to-8.0
git switch --create backport-5185-to-8.0
git cherry-pick -x 588468722680e3855f972f95580c299ae8bde2b1

GuyAv46 added a commit that referenced this pull request Nov 11, 2024
* add a failing test

* fix maxDepth calculation (balance logic)

* fix parent range depth logic

* improve test and remove a print

* fix test

* comment fixes

* tidy up

* simplified test and address CR

* code cleanup and address CR

(cherry picked from commit 5884687)
GuyAv46 added a commit that referenced this pull request Nov 11, 2024
* add a failing test

* fix maxDepth calculation (balance logic)

* fix parent range depth logic

* improve test and remove a print

* fix test

* comment fixes

* tidy up

* simplified test and address CR

* code cleanup and address CR

(cherry picked from commit 5884687)
GuyAv46 added a commit that referenced this pull request Nov 11, 2024
* add a failing test

* fix maxDepth calculation (balance logic)

* fix parent range depth logic

* improve test and remove a print

* fix test

* comment fixes

* tidy up

* simplified test and address CR

* code cleanup and address CR

(cherry picked from commit 5884687)
GuyAv46 added a commit that referenced this pull request Nov 11, 2024
* add a failing test

* fix maxDepth calculation (balance logic)

* fix parent range depth logic

* improve test and remove a print

* fix test

* comment fixes

* tidy up

* simplified test and address CR

* code cleanup and address CR

(cherry picked from commit 5884687)
github-merge-queue bot pushed a commit that referenced this pull request Nov 12, 2024
Fix Numeric Tree Balance - [MOD-8081, MOD-8082] (#5185)

* add a failing test

* fix maxDepth calculation (balance logic)

* fix parent range depth logic

* improve test and remove a print

* fix test

* comment fixes

* tidy up

* simplified test and address CR

* code cleanup and address CR

(cherry picked from commit 5884687)
github-merge-queue bot pushed a commit that referenced this pull request Nov 12, 2024
Fix Numeric Tree Balance - [MOD-8081, MOD-8082] (#5185)

* add a failing test

* fix maxDepth calculation (balance logic)

* fix parent range depth logic

* improve test and remove a print

* fix test

* comment fixes

* tidy up

* simplified test and address CR

* code cleanup and address CR

(cherry picked from commit 5884687)
github-merge-queue bot pushed a commit that referenced this pull request Nov 12, 2024
Fix Numeric Tree Balance - [MOD-8081, MOD-8082] (#5185)

* add a failing test

* fix maxDepth calculation (balance logic)

* fix parent range depth logic

* improve test and remove a print

* fix test

* comment fixes

* tidy up

* simplified test and address CR

* code cleanup and address CR

(cherry picked from commit 5884687)
github-merge-queue bot pushed a commit that referenced this pull request Nov 12, 2024
Fix Numeric Tree Balance - [MOD-8081, MOD-8082] (#5185)

* add a failing test

* fix maxDepth calculation (balance logic)

* fix parent range depth logic

* improve test and remove a print

* fix test

* comment fixes

* tidy up

* simplified test and address CR

* code cleanup and address CR

(cherry picked from commit 5884687)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants