Skip to content

[8.4] [MOD-12170] Implement ASM State Machine on notifications#7572

Merged
redisearch-backport-pull-request[bot] merged 1 commit into8.4from
backport-7331-to-8.4
Nov 29, 2025
Merged

[8.4] [MOD-12170] Implement ASM State Machine on notifications#7572
redisearch-backport-pull-request[bot] merged 1 commit into8.4from
backport-7331-to-8.4

Conversation

@redisearch-backport-pull-request
Copy link
Contributor

@redisearch-backport-pull-request redisearch-backport-pull-request bot commented Nov 29, 2025

Description

Backport of #7331 to 8.4.


Note

Wire ASM state machine into cluster migration/trim events, add slots-tracker FFI API and tests, and update headers/exports accordingly.

  • ASM State Machine:
    • Add src/asm_state_machine.h with APIs: ASM_StateMachine_SetLocalSlots, StartImport, CompleteImport, CompleteMigration, StartTrim, CompleteTrim, managing key_space_version via slots_tracker.
  • Notifications Integration (src/notifications.c):
    • Use ASM state machine in ClusterSlotMigrationEvent and ClusterSlotMigrationTrimEvent by consuming info->slots; start/end worker pool and drop local slots cache where needed.
    • Include asm_state_machine.h and remove unused data markers in these handlers.
  • Rust FFI / Headers:
    • Add slots_tracker_ffi dependency and re-export in redisearch_rs.
    • Update FFI: expose slots_tracker_set_local_slots and slots_tracker_mark_partially_available_slots (drop _internal), keep promote/mark/remove/check APIs; add slots_tracker_reset_for_testing.
    • Regenerate headers/slots_tracker.h to match new API; simplify cbindgen.toml (remove inline C helpers).
  • Tests:
    • Add tests/ctests/test_asm_state_machine.c covering initialization, import, continuous import, and migration trimming workflows.

Written by Cursor Bugbot for commit deddca4. This will update automatically on new commits. Configure here.

* first commit

* fix: handle lock in two phases

* inform ownership in regular command process handlers

* fix: handle proper init

* fix: use conditional variable

* fix: add some comments

* test: add test proving deadlock

* fix: changes as per comments

* fix: remove this locking from conn

* remove code where certainity is not so high

* fix potential issue

* test: add first unit tests version

* test: add proper testing with the value proposition

* test: improve testing

* alternative using auxiliary lock

* simplify thpool a bit

* make sure RedisModule_Yield is protected

* fix: fix issue raised by cursor

* fix: handle potential deadlock in drain also

* fix: fix potential TOCTOU concurrency bug

* fix: fix order of release

* fix: fix potential overflow issue

* fix tests

* fix: fix counts in all variables

* small refactor

* some refactoring of Shared Exclusive Lock

* simplify

* protect GILOwned simple bool

* clarify some comments

* set GILAlternativeLockHeld to true properly

* fix spelling

* add assertion

* fix: fix import RS_LOG_ASSERT

* test: add more conditions to testing

* add some more logic

* improve testing to proper signal main thread can finish while other threads may be waiting for the Shared Lock in the loop

* test: handl test properly

* add another pattern of tests

* fix comments from cursor

* fix concurrency bug

* fix: fix potential race condition at release lock time

* fix: add condition

* test: add more testing

* force testing further to capture more potential errors

* parametrize tests

* test: make tests a little faster

* test: add micro benchmark

* fix compile microbenchmarks

* fix: avoid potential reentrant deadlocks

* fix: avoid potential reentrant deadlocks

* test: avoid leak in test

* test: avoid leak in test

* fix: fix assertion

* fix: fix assertion

* Simplify Shared Lock internals (#7267)

* simplify shared lock

* small improvement to set_timeout

* fix comment

* fix nd improve comments

* condition fix

* remove lock type from release API

* Add lock type back to the release API

* remove Unlocked from enum and handle clock init for macOS

* adapt to use new API

* chnange according to comments

* handle PR comments

* handle PR comments

* fix: take shared lock in other cases

* change as PR comments

* test: simplify test, do not allow query errors

* fix tests as per comments

* fix: handle number of high priority jobs running

* fix: fix test comment

* add ASM to help slot tracking in notifications

* test: add some testing idea

* change as per PR comments

* compile and link test fix

* change draining method to drain high priority

* add ctests for ASM State Machine

* test: add ASM tests

* test: complete tests

* change as per PR comments

* add micro benchmarks with jobs in threads

* remove changes not wanted

* checkout redis feature branch in task test

* remove draining

* move atomic to new header

* remove _internal naming

* fix formatting

---------

Co-authored-by: GuyAv46 <[email protected]>
(cherry picked from commit 44f0253)
in_asm_trim = true;
should_filter_slots = true;
workersThreadPool_OnEventStart();
ASM_StateMachine_StartTrim(slots);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Inconsistent ordering of state machine and thread pool calls

The ordering of ASM_StateMachine_* and workersThreadPool_* calls is inconsistent between IMPORT and TRIM event handlers. In IMPORT events, the state machine update happens before the thread pool notification, but in TRIM events the order is reversed. Since ASM_StateMachine_StartTrim updates the atomic key_space_version that worker threads use for validation, calling workersThreadPool_OnEventStart() first could allow workers to see stale version information before the slot state is properly updated. The pattern used in IMPORT events (state machine first, thread pool second) appears to be the intended order.

Additional Locations (1)

Fix in Cursor Fix in Web

@codecov
Copy link

codecov bot commented Nov 29, 2025

Codecov Report

❌ Patch coverage is 78.78788% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.89%. Comparing base (81137f0) to head (deddca4).
⚠️ Report is 2 commits behind head on 8.4.

Files with missing lines Patch % Lines
...earch_rs/c_entrypoint/slots_tracker_ffi/src/lib.rs 0.00% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              8.4    #7572      +/-   ##
==========================================
- Coverage   85.92%   85.89%   -0.04%     
==========================================
  Files         330      331       +1     
  Lines       52558    52587      +29     
  Branches    12001    12004       +3     
==========================================
+ Hits        45161    45169       +8     
- Misses       7230     7251      +21     
  Partials      167      167              
Flag Coverage Δ
flow 84.44% <80.76%> (-0.17%) ⬇️
unit 52.48% <51.51%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@redisearch-backport-pull-request redisearch-backport-pull-request bot added this pull request to the merge queue Nov 29, 2025
Merged via the queue into 8.4 with commit 5c03444 Nov 29, 2025
43 of 44 checks passed
@redisearch-backport-pull-request redisearch-backport-pull-request bot deleted the backport-7331-to-8.4 branch November 29, 2025 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enforce:coverage Run coverage flow even on draft pull request enforce:sanitize Run sanitizer flow even on draft pull request enforce:test Run the test suite even if no code has changed. size:L size:M size:XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant