[8.4] [MOD-12170] Implement ASM State Machine on notifications#7572
[8.4] [MOD-12170] Implement ASM State Machine on notifications#7572redisearch-backport-pull-request[bot] merged 1 commit into8.4from
Conversation
* first commit * fix: handle lock in two phases * inform ownership in regular command process handlers * fix: handle proper init * fix: use conditional variable * fix: add some comments * test: add test proving deadlock * fix: changes as per comments * fix: remove this locking from conn * remove code where certainity is not so high * fix potential issue * test: add first unit tests version * test: add proper testing with the value proposition * test: improve testing * alternative using auxiliary lock * simplify thpool a bit * make sure RedisModule_Yield is protected * fix: fix issue raised by cursor * fix: handle potential deadlock in drain also * fix: fix potential TOCTOU concurrency bug * fix: fix order of release * fix: fix potential overflow issue * fix tests * fix: fix counts in all variables * small refactor * some refactoring of Shared Exclusive Lock * simplify * protect GILOwned simple bool * clarify some comments * set GILAlternativeLockHeld to true properly * fix spelling * add assertion * fix: fix import RS_LOG_ASSERT * test: add more conditions to testing * add some more logic * improve testing to proper signal main thread can finish while other threads may be waiting for the Shared Lock in the loop * test: handl test properly * add another pattern of tests * fix comments from cursor * fix concurrency bug * fix: fix potential race condition at release lock time * fix: add condition * test: add more testing * force testing further to capture more potential errors * parametrize tests * test: make tests a little faster * test: add micro benchmark * fix compile microbenchmarks * fix: avoid potential reentrant deadlocks * fix: avoid potential reentrant deadlocks * test: avoid leak in test * test: avoid leak in test * fix: fix assertion * fix: fix assertion * Simplify Shared Lock internals (#7267) * simplify shared lock * small improvement to set_timeout * fix comment * fix nd improve comments * condition fix * remove lock type from release API * Add lock type back to the release API * remove Unlocked from enum and handle clock init for macOS * adapt to use new API * chnange according to comments * handle PR comments * handle PR comments * fix: take shared lock in other cases * change as PR comments * test: simplify test, do not allow query errors * fix tests as per comments * fix: handle number of high priority jobs running * fix: fix test comment * add ASM to help slot tracking in notifications * test: add some testing idea * change as per PR comments * compile and link test fix * change draining method to drain high priority * add ctests for ASM State Machine * test: add ASM tests * test: complete tests * change as per PR comments * add micro benchmarks with jobs in threads * remove changes not wanted * checkout redis feature branch in task test * remove draining * move atomic to new header * remove _internal naming * fix formatting --------- Co-authored-by: GuyAv46 <[email protected]> (cherry picked from commit 44f0253)
| in_asm_trim = true; | ||
| should_filter_slots = true; | ||
| workersThreadPool_OnEventStart(); | ||
| ASM_StateMachine_StartTrim(slots); |
There was a problem hiding this comment.
Bug: Inconsistent ordering of state machine and thread pool calls
The ordering of ASM_StateMachine_* and workersThreadPool_* calls is inconsistent between IMPORT and TRIM event handlers. In IMPORT events, the state machine update happens before the thread pool notification, but in TRIM events the order is reversed. Since ASM_StateMachine_StartTrim updates the atomic key_space_version that worker threads use for validation, calling workersThreadPool_OnEventStart() first could allow workers to see stale version information before the slot state is properly updated. The pattern used in IMPORT events (state machine first, thread pool second) appears to be the intended order.
Additional Locations (1)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 8.4 #7572 +/- ##
==========================================
- Coverage 85.92% 85.89% -0.04%
==========================================
Files 330 331 +1
Lines 52558 52587 +29
Branches 12001 12004 +3
==========================================
+ Hits 45161 45169 +8
- Misses 7230 7251 +21
Partials 167 167
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Description
Backport of #7331 to
8.4.Note
Wire ASM state machine into cluster migration/trim events, add slots-tracker FFI API and tests, and update headers/exports accordingly.
src/asm_state_machine.hwith APIs:ASM_StateMachine_SetLocalSlots,StartImport,CompleteImport,CompleteMigration,StartTrim,CompleteTrim, managingkey_space_versionviaslots_tracker.src/notifications.c):ClusterSlotMigrationEventandClusterSlotMigrationTrimEventby consuminginfo->slots; start/end worker pool and drop local slots cache where needed.asm_state_machine.hand remove unuseddatamarkers in these handlers.slots_tracker_ffidependency and re-export inredisearch_rs.slots_tracker_set_local_slotsandslots_tracker_mark_partially_available_slots(drop_internal), keep promote/mark/remove/check APIs; addslots_tracker_reset_for_testing.headers/slots_tracker.hto match new API; simplifycbindgen.toml(remove inline C helpers).tests/ctests/test_asm_state_machine.ccovering initialization, import, continuous import, and migration trimming workflows.Written by Cursor Bugbot for commit deddca4. This will update automatically on new commits. Configure here.