Performance improvement in indexBulkFields - [MOD-8093]#5186
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5186 +/- ##
==========================================
+ Coverage 86.43% 86.50% +0.07%
==========================================
Files 192 192
Lines 34784 34686 -98
==========================================
- Hits 30066 30006 -60
+ Misses 4718 4680 -38 ☔ View full report in Codecov by Sentry. |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 2.8
git worktree add -d .worktree/backport-5186-to-2.8 origin/2.8
cd .worktree/backport-5186-to-2.8
git switch --create backport-5186-to-2.8
git cherry-pick -x dfd463fe089bf2f0ef1bc495f206a6925df72ff7 |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 2.6
git worktree add -d .worktree/backport-5186-to-2.6 origin/2.6
cd .worktree/backport-5186-to-2.6
git switch --create backport-5186-to-2.6
git cherry-pick -x dfd463fe089bf2f0ef1bc495f206a6925df72ff7 |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 2.10
git worktree add -d .worktree/backport-5186-to-2.10 origin/2.10
cd .worktree/backport-5186-to-2.10
git switch --create backport-5186-to-2.10
git cherry-pick -x dfd463fe089bf2f0ef1bc495f206a6925df72ff7 |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 8.0
git worktree add -d .worktree/backport-5186-to-8.0 origin/8.0
cd .worktree/backport-5186-to-8.0
git switch --create backport-5186-to-8.0
git cherry-pick -x dfd463fe089bf2f0ef1bc495f206a6925df72ff7 |
* initial cleanup * cleanup tag * cleanup geoshape * cleanup numeric * more cleanup * clean bulk data object * missed code cleanup * improve vecsim delete doc flow * another vecsim improvement (cherry picked from commit dfd463f)
* initial cleanup * cleanup tag * cleanup geoshape * cleanup numeric * more cleanup * clean bulk data object * missed code cleanup * improve vecsim delete doc flow * another vecsim improvement (cherry picked from commit dfd463f)
* initial cleanup * cleanup tag * cleanup geoshape * cleanup numeric * more cleanup * clean bulk data object * missed code cleanup * improve vecsim delete doc flow * another vecsim improvement (cherry picked from commit dfd463f)
* initial cleanup * cleanup tag * cleanup geoshape * cleanup numeric * more cleanup * clean bulk data object * missed code cleanup * improve vecsim delete doc flow * another vecsim improvement (cherry picked from commit dfd463f)
| RedisModule_ModuleTypeSetValue(*idxKey, GeometryIndexType, idx); | ||
| return idx; | ||
| } | ||
| if (RedisModule_ModuleTypeGetType(*idxKey) == GeometryIndexType) { |
There was a problem hiding this comment.
@GuyAv46 We can also remove the global GeometryIndexType?
* Performance improvement in indexBulkFields - [MOD-8093] (#5186) * initial cleanup * cleanup tag * cleanup geoshape * cleanup numeric * more cleanup * clean bulk data object * missed code cleanup * improve vecsim delete doc flow * another vecsim improvement (cherry picked from commit dfd463f) * fixes for 2.6
* Performance improvement in indexBulkFields - [MOD-8093] (#5186) * initial cleanup * cleanup tag * cleanup geoshape * cleanup numeric * more cleanup * clean bulk data object * missed code cleanup * improve vecsim delete doc flow * another vecsim improvement (cherry picked from commit dfd463f) * fixes for 2.8
* Performance improvement in indexBulkFields - [MOD-8093] (#5186) * initial cleanup * cleanup tag * cleanup geoshape * cleanup numeric * more cleanup * clean bulk data object * missed code cleanup * improve vecsim delete doc flow * another vecsim improvement (cherry picked from commit dfd463f) * remove const
* Performance improvement in indexBulkFields - [MOD-8093] (#5186) * initial cleanup * cleanup tag * cleanup geoshape * cleanup numeric * more cleanup * clean bulk data object * missed code cleanup * improve vecsim delete doc flow * another vecsim improvement (cherry picked from commit dfd463f) * fix inverted index * remove const
* Performance improvement in indexBulkFields - [MOD-8093] (#5186) * initial cleanup * cleanup tag * cleanup geoshape * cleanup numeric * more cleanup * clean bulk data object * missed code cleanup * improve vecsim delete doc flow * another vecsim improvement (cherry picked from commit dfd463f) * fix inverted index * remove const
Describe the changes in the pull request
We are removing a redundant caching mechanism in
indexBulkFieldsthat is no longer in use and takes a long time to initialize.The caching mechanism helped with field keys and inverted index caching. The first is no longer relevant, and the second only helps when we use bulk indexing (which we don’t do anymore).
The redundant caching mechanism requires a large memory initialization, which takes a large percentage of the indexing time for no reason.
Note:
This is a performance fix, the bug does not affect the correctness of any search
Future work:
Removing the need for "formatted keys", stop using the spec's dictionary for fields, and store the relevant data in the field spec instead (today we usually extract the field data by first obtaining the field spec by its name, then getting the formatted redis string from it, and lastly performing a dictionary lookup by the formatted key name)
Mark if applicable