Skip to content

[MOD-12252] [MOD-11658] reduce number of worker threads asynchronously#7410

Merged
JoanFM merged 50 commits intomasterfrom
joan-config-changes-safe
Nov 25, 2025
Merged

[MOD-12252] [MOD-11658] reduce number of worker threads asynchronously#7410
JoanFM merged 50 commits intomasterfrom
joan-config-changes-safe

Conversation

@JoanFM
Copy link
Collaborator

@JoanFM JoanFM commented Nov 18, 2025

Describe the changes in the pull request

This PR aims to solve the deadlock while reducing number of threads and waiting for them to finish.

Changes:

  • Reduction of number of threads in the worker threadpool are done asynchronously, the effects are not inmediate.

Note

Adds an async thread-reduction mechanism and reliable thread start signaling to the threadpool, switches workers to non-blocking downsizing, and updates tests and tooling accordingly.

  • Threadpool (deps/thpool):
    • Add redisearch_thpool_schedule_config_reduce_threads_job(...) to asynchronously reduce threads (supports terminate-when-empty); mark pool UNINITIALIZED when removing all.
    • Refactor thread startup: thread_init(thpool, bool *started) and thread_do(void *) with start signaling; wait deterministically for new threads.
    • Extract redisearch_thpool_push_chain(...); add redisearch_thpool_add_n_work_not_verify_init(...) for enqueueing without init verification.
    • Improve redisearch_thpool_add_threads(...) to wait on per-thread start flags; logging/RS_ASSERT updates.
    • Remove synchronous APIs (redisearch_thpool_remove_threads, redisearch_thpool_terminate_when_empty) from headers; adjust internal uses.
  • Workers (src/util/workers.c):
    • Replace blocking downsizing with async redisearch_thpool_schedule_config_reduce_threads_job(...) for both partial and full reduction; keep activation/deactivation hooks; add drain logging.
  • Config/Debug/Notifications:
    • Keep coordinator update trigger after WORKERS change; warn that FT.DEBUG WORKERS DRAIN blocks; drain workers during loading progress.
  • Tests:
    • Update to async reduction semantics; add stress tests for reducing WORKERS under load (avoid deadlock), gradual scaling up/down, and various thread lifecycle scenarios.

Written by Cursor Bugbot for commit aa11f69. This will update automatically on new commits. Configure here.

@JoanFM JoanFM added enforce:coverage Run coverage flow even on draft pull request enforce:sanitize Run sanitizer flow even on draft pull request enforce:test Run the test suite even if no code has changed. labels Nov 18, 2025
@JoanFM JoanFM force-pushed the joan-config-changes-safe branch from 42b607c to 019aa87 Compare November 18, 2025 15:58
@JoanFM JoanFM force-pushed the joan-config-changes-safe branch from 019aa87 to faa37bd Compare November 18, 2025 16:09
@JoanFM JoanFM changed the title change number of threads asynchronously [MOD-12252] [MOD-11658] change number of threads asynchronously Nov 18, 2025
@JoanFM JoanFM force-pushed the joan-config-changes-safe branch from b44df96 to 1b28e6a Compare November 19, 2025 16:18
@JoanFM JoanFM requested review from GuyAv46 and alonre24 November 25, 2025 13:13
GuyAv46
GuyAv46 previously approved these changes Nov 25, 2025
@JoanFM JoanFM enabled auto-merge November 25, 2025 13:27
@JoanFM JoanFM added this pull request to the merge queue Nov 25, 2025
@JoanFM JoanFM removed this pull request from the merge queue due to a manual request Nov 25, 2025
@JoanFM JoanFM added this pull request to the merge queue Nov 25, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 25, 2025
@JoanFM JoanFM added this pull request to the merge queue Nov 25, 2025
Merged via the queue into master with commit 874ce68 Nov 25, 2025
26 checks passed
@JoanFM JoanFM deleted the joan-config-changes-safe branch November 25, 2025 23:56
@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.8, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.8
git worktree add -d .worktree/backport-7410-to-2.8 origin/2.8
cd .worktree/backport-7410-to-2.8
git switch --create backport-7410-to-2.8
git cherry-pick -x 874ce6826d1ea0ba59f281aa7b60b738ad49794b

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.10, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.10
git worktree add -d .worktree/backport-7410-to-2.10 origin/2.10
cd .worktree/backport-7410-to-2.10
git switch --create backport-7410-to-2.10
git cherry-pick -x 874ce6826d1ea0ba59f281aa7b60b738ad49794b

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 8.2, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.2
git worktree add -d .worktree/backport-7410-to-8.2 origin/8.2
cd .worktree/backport-7410-to-8.2
git switch --create backport-7410-to-8.2
git cherry-pick -x 874ce6826d1ea0ba59f281aa7b60b738ad49794b

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 8.4, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.4
git worktree add -d .worktree/backport-7410-to-8.4 origin/8.4
cd .worktree/backport-7410-to-8.4
git switch --create backport-7410-to-8.4
git cherry-pick -x 874ce6826d1ea0ba59f281aa7b60b738ad49794b

JoanFM added a commit that referenced this pull request Nov 26, 2025
#7410)

* change number of threads asynchronously

* do some changes to add thread

* improve code

* cancel old config change if happens

* more improvements

* destroy when sure noone needsit

* simplify implementation drastically

* improve async implementation

* revert barrier changes

* test: do not skip on standalone

* test: add new test

* fix tests

* remove extra tests

* remove unneeded code

* delay ensure init after

* fix: remove weird unlock

* test: handle cpp tests

* small improvement on the test

* improve the checking of thread starting

* remove some unneeded functions from thpool

* remove unused code from tests

* small cleanup of code

* update deps/thpool/thpool.c

* add checks to thpool stats in test

* fix multithread test

* fix memory leak and remove unreachable code

* assert some invariants

* fix: respect adding workers is a lazy operation even after start

* small changes

* remove double assertion

* add another assertion

* small change to avoid dangling pointer in stack

* changes according to comments to the code

* change tests of place and simplify some

* small change

* try to improve flakiness and test time using TimeLimit

* make static and reuse code

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* add volatile to started array

* remove extra assertion

* add warning to ft debug command code comment

* change assertion method

* remove flakiness from test

* change test query while flush test

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

---------

Co-authored-by: GuyAv46 <[email protected]>
(cherry picked from commit 874ce68)
JoanFM added a commit that referenced this pull request Nov 26, 2025
#7410)

* change number of threads asynchronously

* do some changes to add thread

* improve code

* cancel old config change if happens

* more improvements

* destroy when sure noone needsit

* simplify implementation drastically

* improve async implementation

* revert barrier changes

* test: do not skip on standalone

* test: add new test

* fix tests

* remove extra tests

* remove unneeded code

* delay ensure init after

* fix: remove weird unlock

* test: handle cpp tests

* small improvement on the test

* improve the checking of thread starting

* remove some unneeded functions from thpool

* remove unused code from tests

* small cleanup of code

* update deps/thpool/thpool.c

* add checks to thpool stats in test

* fix multithread test

* fix memory leak and remove unreachable code

* assert some invariants

* fix: respect adding workers is a lazy operation even after start

* small changes

* remove double assertion

* add another assertion

* small change to avoid dangling pointer in stack

* changes according to comments to the code

* change tests of place and simplify some

* small change

* try to improve flakiness and test time using TimeLimit

* make static and reuse code

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* add volatile to started array

* remove extra assertion

* add warning to ft debug command code comment

* change assertion method

* remove flakiness from test

* change test query while flush test

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

---------

Co-authored-by: GuyAv46 <[email protected]>
(cherry picked from commit 874ce68)
JoanFM added a commit that referenced this pull request Nov 26, 2025
#7410)

* change number of threads asynchronously

* do some changes to add thread

* improve code

* cancel old config change if happens

* more improvements

* destroy when sure noone needsit

* simplify implementation drastically

* improve async implementation

* revert barrier changes

* test: do not skip on standalone

* test: add new test

* fix tests

* remove extra tests

* remove unneeded code

* delay ensure init after

* fix: remove weird unlock

* test: handle cpp tests

* small improvement on the test

* improve the checking of thread starting

* remove some unneeded functions from thpool

* remove unused code from tests

* small cleanup of code

* update deps/thpool/thpool.c

* add checks to thpool stats in test

* fix multithread test

* fix memory leak and remove unreachable code

* assert some invariants

* fix: respect adding workers is a lazy operation even after start

* small changes

* remove double assertion

* add another assertion

* small change to avoid dangling pointer in stack

* changes according to comments to the code

* change tests of place and simplify some

* small change

* try to improve flakiness and test time using TimeLimit

* make static and reuse code

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* add volatile to started array

* remove extra assertion

* add warning to ft debug command code comment

* change assertion method

* remove flakiness from test

* change test query while flush test

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

---------

Co-authored-by: GuyAv46 <[email protected]>
(cherry picked from commit 874ce68)
github-merge-queue bot pushed a commit that referenced this pull request Nov 26, 2025
…onously (#7519)

* [MOD-12252] [MOD-11658] reduce number of worker threads asynchronously (#7410)

* change number of threads asynchronously

* do some changes to add thread

* improve code

* cancel old config change if happens

* more improvements

* destroy when sure noone needsit

* simplify implementation drastically

* improve async implementation

* revert barrier changes

* test: do not skip on standalone

* test: add new test

* fix tests

* remove extra tests

* remove unneeded code

* delay ensure init after

* fix: remove weird unlock

* test: handle cpp tests

* small improvement on the test

* improve the checking of thread starting

* remove some unneeded functions from thpool

* remove unused code from tests

* small cleanup of code

* update deps/thpool/thpool.c

* add checks to thpool stats in test

* fix multithread test

* fix memory leak and remove unreachable code

* assert some invariants

* fix: respect adding workers is a lazy operation even after start

* small changes

* remove double assertion

* add another assertion

* small change to avoid dangling pointer in stack

* changes according to comments to the code

* change tests of place and simplify some

* small change

* try to improve flakiness and test time using TimeLimit

* make static and reuse code

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* add volatile to started array

* remove extra assertion

* add warning to ft debug command code comment

* change assertion method

* remove flakiness from test

* change test query while flush test

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

---------

Co-authored-by: GuyAv46 <[email protected]>
(cherry picked from commit 874ce68)

* remove test that is added in another PR

---------

Co-authored-by: GuyAv46 <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Nov 26, 2025
…ronously (#7535)

* [MOD-12252] [MOD-11658] reduce number of worker threads asynchronously (#7410)

* change number of threads asynchronously

* do some changes to add thread

* improve code

* cancel old config change if happens

* more improvements

* destroy when sure noone needsit

* simplify implementation drastically

* improve async implementation

* revert barrier changes

* test: do not skip on standalone

* test: add new test

* fix tests

* remove extra tests

* remove unneeded code

* delay ensure init after

* fix: remove weird unlock

* test: handle cpp tests

* small improvement on the test

* improve the checking of thread starting

* remove some unneeded functions from thpool

* remove unused code from tests

* small cleanup of code

* update deps/thpool/thpool.c

* add checks to thpool stats in test

* fix multithread test

* fix memory leak and remove unreachable code

* assert some invariants

* fix: respect adding workers is a lazy operation even after start

* small changes

* remove double assertion

* add another assertion

* small change to avoid dangling pointer in stack

* changes according to comments to the code

* change tests of place and simplify some

* small change

* try to improve flakiness and test time using TimeLimit

* make static and reuse code

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* add volatile to started array

* remove extra assertion

* add warning to ft debug command code comment

* change assertion method

* remove flakiness from test

* change test query while flush test

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

---------

Co-authored-by: GuyAv46 <[email protected]>
(cherry picked from commit 874ce68)

* fix compilation issue

* fix import threading on test

* fix import threading on test

---------

Co-authored-by: GuyAv46 <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request Nov 26, 2025
…onously (#7534)

* [MOD-12252] [MOD-11658] reduce number of worker threads asynchronously (#7410)

* change number of threads asynchronously

* do some changes to add thread

* improve code

* cancel old config change if happens

* more improvements

* destroy when sure noone needsit

* simplify implementation drastically

* improve async implementation

* revert barrier changes

* test: do not skip on standalone

* test: add new test

* fix tests

* remove extra tests

* remove unneeded code

* delay ensure init after

* fix: remove weird unlock

* test: handle cpp tests

* small improvement on the test

* improve the checking of thread starting

* remove some unneeded functions from thpool

* remove unused code from tests

* small cleanup of code

* update deps/thpool/thpool.c

* add checks to thpool stats in test

* fix multithread test

* fix memory leak and remove unreachable code

* assert some invariants

* fix: respect adding workers is a lazy operation even after start

* small changes

* remove double assertion

* add another assertion

* small change to avoid dangling pointer in stack

* changes according to comments to the code

* change tests of place and simplify some

* small change

* try to improve flakiness and test time using TimeLimit

* make static and reuse code

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* add volatile to started array

* remove extra assertion

* add warning to ft debug command code comment

* change assertion method

* remove flakiness from test

* change test query while flush test

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

* Update deps/thpool/thpool.c

Co-authored-by: GuyAv46 <[email protected]>

---------

Co-authored-by: GuyAv46 <[email protected]>
(cherry picked from commit 874ce68)

* fix import threading on test

* fix import threading on test

---------

Co-authored-by: GuyAv46 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport 2.8 backport 2.10 backport 8.2 backport 8.4 enforce:coverage Run coverage flow even on draft pull request enforce:sanitize Run sanitizer flow even on draft pull request enforce:test Run the test suite even if no code has changed. size:L size:M size:XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants