Skip to content

[MOD-12668] Use asynchronous jobs for in GC for SVS#7666

Merged
alonre24 merged 6 commits intomasterfrom
svs_reserve_threads_for_gc
Dec 9, 2025
Merged

[MOD-12668] Use asynchronous jobs for in GC for SVS#7666
alonre24 merged 6 commits intomasterfrom
svs_reserve_threads_for_gc

Conversation

@alonre24
Copy link
Collaborator

@alonre24 alonre24 commented Dec 7, 2025

Describe the changes in the pull request

Bump vecsim version to the one that reserves threads when we are executing SVS GC, which is using SVS thread pool for parallalism.
This includes extending the debug command that brings stats about the thread pool, so we can test properly that we are submitting jobs after calling the GC.

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

Note

Expose numJobsInProgress in FT.DEBUG WORKERS stats and update VecSim SVS tests/INFO to use prefixes, verify GC async jobs, and correct deletion counts.

  • Threadpool / Debug:
    • Add num_jobs_in_progress to thpool_stats (in deps/thpool/thpool.h/.c) and include it in redisearch_thpool_get_stats.
    • Extend FT.DEBUG WORKERS stats reply with numJobsInProgress.
  • Tests:
    • Update tests/pytests/test_debug_commands.py to assert numJobsInProgress in workers stats.
    • Adjust tests/pytests/test_info_modules.py VecSim section:
      • Use PREFIX 'doc:' for HNSW/FLAT indexes; insert SVS-only docs.
      • Expect 3 marked-deleted vectors (HNSW×2 + SVS), verify backend counters, and drain workers after GC.
    • Enhance tests/pytests/test_vecsim_svs.py GC test:
      • Parameterize workers, pause/resume workers around GC, assert pending low-priority jobs equals worker count, then drain.
      • Import helpers/constants as needed.

Written by Cursor Bugbot for commit 46707cf. This will update automatically on new commits. Configure here.

@github-actions github-actions bot added the size:S label Dec 7, 2025
@alonre24 alonre24 changed the title svs_reserve_threads_for_gc [MOD-12668] Use asynchronous jobs for in GC for SVS Dec 7, 2025
@alonre24 alonre24 requested a review from meiravgri December 7, 2025 12:44
@codecov
Copy link

codecov bot commented Dec 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.78%. Comparing base (d3d21f6) to head (46707cf).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7666      +/-   ##
==========================================
- Coverage   84.81%   84.78%   -0.03%     
==========================================
  Files         349      349              
  Lines       54435    54437       +2     
  Branches    14500    14500              
==========================================
- Hits        46168    46155      -13     
- Misses       8072     8087      +15     
  Partials      195      195              
Flag Coverage Δ
flow 85.14% <100.00%> (-0.16%) ⬇️
unit 51.84% <50.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

cur_workers_stats = getWorkersThpoolStats(env)

# GC background jobs for SVS should be pending in low priority queue.
env.assertEqual(cur_workers_stats['lowPriorityPendingJobs'], num_workers, message=f"{message_prefix}")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Missing baseline for low priority pending jobs assertion

The test asserts that lowPriorityPendingJobs equals exactly num_workers after invoking GC, but it doesn't capture the baseline lowPriorityPendingJobs value before calling forceInvokeGC. If any low-priority jobs exist in the queue prior to GC invocation (e.g., from background indexing or deletion operations), the assertion will fail because it checks the total count rather than the delta. The code captures totalJobsDone before GC but should also capture lowPriorityPendingJobs to properly verify that GC added exactly num_workers new jobs.

Additional Locations (1)

Fix in Cursor Fix in Web

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The baseline lowPriorityPendingJobs value before calling forceInvokeGC should be zero.

@alonre24 alonre24 requested a review from meiravgri December 7, 2025 15:59
meiravgri
meiravgri previously approved these changes Dec 7, 2025
@meiravgri
Copy link
Collaborator

backport?

@alonre24
Copy link
Collaborator Author

alonre24 commented Dec 7, 2025

backport?

the backport should use vecsim 8.2, so it would have conflicts either way, I'll create a separate PR.

@alonre24 alonre24 enabled auto-merge December 7, 2025 16:35
@alonre24 alonre24 requested a review from meiravgri December 8, 2025 11:47
@alonre24 alonre24 added this pull request to the merge queue Dec 8, 2025
github-merge-queue bot pushed a commit that referenced this pull request Dec 8, 2025
* bump vecsim + add test (wip)

* Use and test GC jobs

* improve test with pauseing threadpool

* fix for test_redis_info_modules_vecsim to account for gc in svs

* fix cursor comments
@meiravgri meiravgri removed this pull request from the merge queue due to a manual request Dec 8, 2025
@meiravgri meiravgri added this pull request to the merge queue Dec 8, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 8, 2025
@alonre24 alonre24 added this pull request to the merge queue Dec 9, 2025
Merged via the queue into master with commit 2402970 Dec 9, 2025
26 checks passed
@alonre24 alonre24 deleted the svs_reserve_threads_for_gc branch December 9, 2025 11:11
@alonre24
Copy link
Collaborator Author

alonre24 commented Dec 9, 2025

/backport

redisearch-backport-pull-request bot pushed a commit that referenced this pull request Dec 9, 2025
* bump vecsim + add test (wip)

* Use and test GC jobs

* improve test with pauseing threadpool

* fix for test_redis_info_modules_vecsim to account for gc in svs

* fix cursor comments

(cherry picked from commit 2402970)
@redisearch-backport-pull-request
Copy link
Contributor

Successfully created backport PR for 8.4:

github-merge-queue bot pushed a commit that referenced this pull request Dec 10, 2025
[MOD-12668] Use asynchronous jobs for in GC for SVS (#7666)

* bump vecsim + add test (wip)

* Use and test GC jobs

* improve test with pauseing threadpool

* fix for test_redis_info_modules_vecsim to account for gc in svs

* fix cursor comments

(cherry picked from commit 2402970)

Co-authored-by: alonre24 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants