Skip to content

ASM Slots Protocol Communication - [MOD-11988, MOD-12308]#7253

Merged
JoanFM merged 140 commits intomasterfrom
guyav-ASM_slots_protocol
Nov 10, 2025
Merged

ASM Slots Protocol Communication - [MOD-11988, MOD-12308]#7253
JoanFM merged 140 commits intomasterfrom
guyav-ASM_slots_protocol

Conversation

@GuyAv46
Copy link
Collaborator

@GuyAv46 GuyAv46 commented Nov 7, 2025

Describe the changes in the pull request

Continuing the work started by @JoanFM at #7129 and #7130, with a few simplifications and alignments toward #7200.

This PR implements the technical part of the protocol for FT.SEARCH, FT.AGGREGATE, and FT.HYBRID, including:

  1. Serializing and deserializing slots info.
  2. Attaching the right info to the right shard.
  3. Handle syntax errors for the new argument and make it mandatory for internal commands.
  4. Propagating it to the consumer (QueryIteratorRP).

In a following PR, after #7200 is also merged, we will actually validate the slots matches the local slots.

Mark if applicable

  • This PR introduces API changes
  • This PR introduces internal API changes
  • This PR introduces serialization changes

Note

Implements slot-range serialization, propagation, and parsing across coordinator and shards for FT.SEARCH/AGGREGATE/HYBRID, updates MRCommand/topology, and adds comprehensive tests.

  • Protocol & Parsing
    • Add little-endian slot-range (de)serialization in slot_ranges.[ch] and clone helpers.
    • Parse _SLOTS_INFO for internal requests in aggregate_request.c and hybrid parser; make mandatory for internal.
  • Coordinator & Command Wiring
    • Embed _SLOTS_INFO <binary> into outbound MR commands (MRCommand_PrepareForSlotInfo/SetSlotInfo).
    • Populate per-shard slot data in fanout and iterator send paths (cluster.c, rmr.c).
    • Include index prefixes and slot placeholders in built commands for FT.AGGREGATE and FT.HYBRID.
  • Topology & Cluster
    • Track shard slot ranges in topology (cluster_topology.[ch], redis_cluster.c, redise.c); deep-clone and free.
    • MR_UpdateTopology now accepts local slot ranges; drop cached local slots on refresh.
  • Request/Pipeline
    • Store querySlots/slotsVersion in AREQ and pass into pipeline/result processor (RPQueryIterator).
  • Utilities
    • Unify timeout counters to uint32_t.
    • Minor loop var type cleanups.
  • Tests
    • Add C++ tests for MRCommand slot blobs, AREQ compile with slots, slot-range (de)serialization; C tests for topology clone; Pytests for ACL/internal _SLOTS_INFO and hybrid flows.

Written by Cursor Bugbot for commit ddd3c14. This will update automatically on new commits. Configure here.

@GuyAv46 GuyAv46 requested review from JoanFM and alonre24 November 9, 2025 10:39
Copy link
Collaborator

@JoanFM JoanFM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should isolate unrelated changes to its own PR, and maybe clarify more some naming for slotRanges when specially 2 instances are involved in same structure.

@GuyAv46 GuyAv46 requested review from JoanFM and alonre24 November 10, 2025 12:08
@GuyAv46 GuyAv46 changed the title ASM Slots Protocol Communication - [MOD-12003] ASM Slots Protocol Communication - [MOD-11988, MOD-12308] Nov 10, 2025
@JoanFM JoanFM added this pull request to the merge queue Nov 10, 2025
Merged via the queue into master with commit e2ea609 Nov 10, 2025
19 checks passed
@JoanFM JoanFM deleted the guyav-ASM_slots_protocol branch November 10, 2025 23:37
redisearch-backport-pull-request bot pushed a commit that referenced this pull request Nov 10, 2025
* initial registration implementation

* Set up most of the notification events

* address review comment and add logs

* implement propagation

* minor improvements

* small fix

* review fixes

* [TEMP] set CI for new API

* dictionary tidy up

* add first test

* fix helper and test for duplicates

* Add missing sub-command

* add unit-tests for coverage

* change workersThreadPool_OnEvent* behavior to handle overlapping events

* rename events after API update

* implement managed and thread-safe slot ranges API

* drop cached API when needed

* rename API and use in the pipeline

* move includes to C file

* skip current flow test, and introduce a stable one

* fix initial reference count

* implement mock and add a test

* add missing key-space notification subscription

* add a test with adding a new node and propagating the schema

* review fixes

* fix assertion

* leak fix

* use malloc

* leak actual fix

* don't drop the cache on failed events

* add BG variants to flow tests

* joan add comments to place relevant code places

* initial registration implementation

* Set up most of the notification events

* address review comment and add logs

* implement propagation

* minor improvements

* small fix

* review fixes

* [TEMP] set CI for new API

* dictionary tidy up

* add first test

* fix helper and test for duplicates

* Add missing sub-command

* add unit-tests for coverage

* change workersThreadPool_OnEvent* behavior to handle overlapping events

* rename events after API update

* implement managed and thread-safe slot ranges API

* drop cached API when needed

* rename API and use in the pipeline

* move includes to C file

* skip current flow test, and introduce a stable one

* fix initial reference count

* implement mock and add a test

* add missing key-space notification subscription

* add a test with adding a new node and propagating the schema

* review fixes

* fix assertion

* leak fix

* use malloc

* leak actual fix

* don't drop the cache on failed events

* add BG variants to flow tests

* add more TODO signaling where code needs to change

* add serialization and deserialization in binary format and testing

* test: add more tests

* test: some adaptation to tests

* add MRCommand method to add slot ranges info

* test: change some tests

* improve the testing of Command

* add parametrization

* test: remove some parts of the command cpp tests

* remvoe some comments

* fix: protect in case no slotRanges

* test: add tests for HumanReadable capacity

* some refactoring

* fix problems merging

* fix problems merging

* fix problems merging

* fix problems merging

* fix problems merging

* refactor do not include num_ranges into the binary serialization

* add coordSlotRanges struct attribute

* code to parse the coordSlotRanges

* add the building block to parse in private Hybrid Requests

* test: fix test

* handle proper parsing

* test: add AREQ_Compile tests to see how slot ranges are properly parsed

* fix: fix serialize issue

* test: fix memory leak in test

* fix: change serialization to little endian

* fix: remove size sending from MR build command

* test: parametrize test to cover more cases

* add parsing from CLUSTER SHARDS output

* fix: fix issue not initialized coordSlotRanges

* handle CLUSTERSET parsing

* fix: avoid sending SLOTS for FT CURSOR and fix hybrid parsing

* fix: fix double free error

* fix: fix issue with DEBUG_COUNT on Debug commands

* fix issues with DEBUG command

* remove leftover TODO

* simplify SlotRangesArray_Serialize and SlotRangesArray_Deserialize

* refactor parsing and handling slots info

* cleanup

* revert unnecessary debug command changes

* verify arg is internal and mandatory

* align cluster slot parsing and make preparations for local refresh

* cleanup cluster topology

* add MRCommand_PrepareForSlotInfo and MRCommand_SetSlotInfo

* prepare search, aggregate, and hybrid

* set slots info for search, aggregate and hybrid

* fix warnings

* fix tests to new API

* fix tests

* fix hybrid slot parsing

* fix error codes after merging master

* improve comments and fix a leak in test

* allocation fix

* add slots info to tests of internal commands

* add asserts

* extend tests

* fix MRCommand_SetSlotInfo not affecting cached commands

* minor test improvements

* move SlotRangesArray_Deserialize validation to before any allocation is made

* rename _SLOTS to _SLOTS_INFO

* add asserts and comments

* wrap memmove logic in a macro

* test slotRanges on cloning test

* rename slotRanges_ to querySlots

* minor cleanup

---------

Co-authored-by: Joan Fontanals Martinez <[email protected]>
(cherry picked from commit e2ea609)
@redisearch-backport-pull-request
Copy link
Contributor

Successfully created backport PR for 8.4:

github-merge-queue bot pushed a commit that referenced this pull request Nov 11, 2025
* ASM Slots Protocol Communication - [MOD-11988, MOD-12308] (#7253)

* initial registration implementation

* Set up most of the notification events

* address review comment and add logs

* implement propagation

* minor improvements

* small fix

* review fixes

* [TEMP] set CI for new API

* dictionary tidy up

* add first test

* fix helper and test for duplicates

* Add missing sub-command

* add unit-tests for coverage

* change workersThreadPool_OnEvent* behavior to handle overlapping events

* rename events after API update

* implement managed and thread-safe slot ranges API

* drop cached API when needed

* rename API and use in the pipeline

* move includes to C file

* skip current flow test, and introduce a stable one

* fix initial reference count

* implement mock and add a test

* add missing key-space notification subscription

* add a test with adding a new node and propagating the schema

* review fixes

* fix assertion

* leak fix

* use malloc

* leak actual fix

* don't drop the cache on failed events

* add BG variants to flow tests

* joan add comments to place relevant code places

* initial registration implementation

* Set up most of the notification events

* address review comment and add logs

* implement propagation

* minor improvements

* small fix

* review fixes

* [TEMP] set CI for new API

* dictionary tidy up

* add first test

* fix helper and test for duplicates

* Add missing sub-command

* add unit-tests for coverage

* change workersThreadPool_OnEvent* behavior to handle overlapping events

* rename events after API update

* implement managed and thread-safe slot ranges API

* drop cached API when needed

* rename API and use in the pipeline

* move includes to C file

* skip current flow test, and introduce a stable one

* fix initial reference count

* implement mock and add a test

* add missing key-space notification subscription

* add a test with adding a new node and propagating the schema

* review fixes

* fix assertion

* leak fix

* use malloc

* leak actual fix

* don't drop the cache on failed events

* add BG variants to flow tests

* add more TODO signaling where code needs to change

* add serialization and deserialization in binary format and testing

* test: add more tests

* test: some adaptation to tests

* add MRCommand method to add slot ranges info

* test: change some tests

* improve the testing of Command

* add parametrization

* test: remove some parts of the command cpp tests

* remvoe some comments

* fix: protect in case no slotRanges

* test: add tests for HumanReadable capacity

* some refactoring

* fix problems merging

* fix problems merging

* fix problems merging

* fix problems merging

* fix problems merging

* refactor do not include num_ranges into the binary serialization

* add coordSlotRanges struct attribute

* code to parse the coordSlotRanges

* add the building block to parse in private Hybrid Requests

* test: fix test

* handle proper parsing

* test: add AREQ_Compile tests to see how slot ranges are properly parsed

* fix: fix serialize issue

* test: fix memory leak in test

* fix: change serialization to little endian

* fix: remove size sending from MR build command

* test: parametrize test to cover more cases

* add parsing from CLUSTER SHARDS output

* fix: fix issue not initialized coordSlotRanges

* handle CLUSTERSET parsing

* fix: avoid sending SLOTS for FT CURSOR and fix hybrid parsing

* fix: fix double free error

* fix: fix issue with DEBUG_COUNT on Debug commands

* fix issues with DEBUG command

* remove leftover TODO

* simplify SlotRangesArray_Serialize and SlotRangesArray_Deserialize

* refactor parsing and handling slots info

* cleanup

* revert unnecessary debug command changes

* verify arg is internal and mandatory

* align cluster slot parsing and make preparations for local refresh

* cleanup cluster topology

* add MRCommand_PrepareForSlotInfo and MRCommand_SetSlotInfo

* prepare search, aggregate, and hybrid

* set slots info for search, aggregate and hybrid

* fix warnings

* fix tests to new API

* fix tests

* fix hybrid slot parsing

* fix error codes after merging master

* improve comments and fix a leak in test

* allocation fix

* add slots info to tests of internal commands

* add asserts

* extend tests

* fix MRCommand_SetSlotInfo not affecting cached commands

* minor test improvements

* move SlotRangesArray_Deserialize validation to before any allocation is made

* rename _SLOTS to _SLOTS_INFO

* add asserts and comments

* wrap memmove logic in a macro

* test slotRanges on cloning test

* rename slotRanges_ to querySlots

* minor cleanup

---------

Co-authored-by: Joan Fontanals Martinez <[email protected]>
(cherry picked from commit e2ea609)

* fix error codes for 8.4

* add missing include

* remove self-managed sds

---------

Co-authored-by: GuyAv46 <[email protected]>
Co-authored-by: Joan Fontanals Martinez <[email protected]>
Co-authored-by: GuyAv46 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants