Support Multiple Slot Ranges in search.CLUSTERSET - [MOD-11657]#7508
Pull request overview
This PR adds support for multiple slot ranges per shard in the search.CLUSTERSET command for Redis Enterprise deployments. Previously, each shard could only have a single contiguous slot range, but Redis Enterprise can assign multiple non-contiguous ranges to the same shard.
Key Changes
- Modified the internal `RLShard` structure to store an array of slot ranges instead of a single start/end pair
- Refactored the topology parsing logic to use a dictionary for aggregating multiple range entries for the same shard ID
- Added comprehensive C++ unit tests covering various scenarios including multiple ranges, replicas, and error conditions
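The data-model change in the list above can be illustrated with a minimal sketch. This is not the PR's code: `SlotRange`, `Shard`, `ShardTable`, and `table_add_range` are hypothetical names, the real `RLShard` layout may differ, and a linear scan over shard IDs stands in for the dictionary lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SHARD_ID_MAX 64 /* hypothetical limit, for illustration only */

/* Hypothetical types; not the real RLShard definition. */
typedef struct {
  uint16_t start;
  uint16_t end; /* inclusive */
} SlotRange;

typedef struct {
  char id[SHARD_ID_MAX]; /* shard ID, used as the aggregation key */
  SlotRange *ranges;     /* growable array instead of one start/end pair */
  size_t numRanges;
} Shard;

typedef struct {
  Shard *shards;
  size_t numShards;
} ShardTable;

/* Find the shard with the given ID, or append a new empty one.
 * A linear scan stands in for the dictionary lookup.
 * Returns NULL if the ID is too long or allocation fails. */
static Shard *table_find_or_add(ShardTable *t, const char *id) {
  for (size_t i = 0; i < t->numShards; i++) {
    if (strcmp(t->shards[i].id, id) == 0) return &t->shards[i];
  }
  if (strlen(id) >= SHARD_ID_MAX) return NULL;
  Shard *grown = realloc(t->shards, (t->numShards + 1) * sizeof(Shard));
  if (!grown) return NULL;
  t->shards = grown;
  Shard *s = &t->shards[t->numShards++];
  memset(s, 0, sizeof(*s));
  strcpy(s->id, id); /* safe: length checked above */
  return s;
}

/* Record one more slot range for the shard identified by `id`.
 * Returns 0 on success, -1 on an invalid range or allocation failure. */
static int table_add_range(ShardTable *t, const char *id,
                           uint16_t start, uint16_t end) {
  if (start > end) return -1;
  Shard *s = table_find_or_add(t, id);
  if (!s) return -1;
  SlotRange *grown = realloc(s->ranges, (s->numRanges + 1) * sizeof(SlotRange));
  if (!grown) return -1;
  s->ranges = grown;
  s->ranges[s->numRanges++] = (SlotRange){start, end};
  return 0;
}

/* Release everything owned by the table. */
static void table_free(ShardTable *t) {
  for (size_t i = 0; i < t->numShards; i++) free(t->shards[i].ranges);
  free(t->shards);
  t->shards = NULL;
  t->numShards = 0;
}
```

Feeding the entries `("1", 0, 4095)`, `("2", 4096, 8191)`, `("1", 8192, 16383)` through `table_add_range` leaves two shards, the first of which owns two non-contiguous ranges. With a single start/end pair per shard, that layout could not be represented.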
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/coord/rmr/redise.c | Core parsing logic refactored to support multiple slot ranges per shard using dictionary-based aggregation |
| src/coord/rmr/redise.h | Added C++ extern linkage for the ParseTopology function |
| src/coord/rmr/cluster_topology.h | Exposed MRClusterNode_Free and MRClusterTopology_SortShards as public API |
| src/coord/rmr/cluster_topology.c | Moved sorting function from redis_cluster.c and made MRClusterNode_Free public |
| src/coord/rmr/redis_cluster.c | Removed local sortShards function, now using shared implementation |
| deps/rmutil/args.h | Added AC_GetU16 function declaration for parsing uint16_t values |
| deps/rmutil/args.c | Implemented AC_GetU16 function for slot range parsing |
| tests/cpptests/coord_tests/test_cpp_clusterset.cpp | Added comprehensive test suite with 30+ test cases |
| tests/cpptests/redismock/util.h | Added CreateArgv overload for std::vector<std::string> |
| tests/cpptests/redismock/util.cpp | Implemented CreateArgv for std::vector arguments |
| tests/cpptests/redismock/redismock.h | Added RMCK_GetLastError function declaration |
| tests/cpptests/redismock/redismock.cpp | Implemented ReplyWithError tracking and GetLastError |
| tests/cpptests/redismock/internal.h | Added last_error field to RedisModuleCtx for test validation |
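The table mentions an `AC_GetU16` helper for parsing `uint16_t` slot values. Its real signature is not shown here, so the following is only a sketch of what bounded 16-bit parsing might look like. `parse_u16` and `parse_slot_range` are hypothetical names. The upper bound of 16383 is the highest Redis Cluster hash slot; whether the PR enforces that bound in the same place is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_SLOT 16383 /* Redis Cluster hash slots are numbered 0..16383 */

/* Hypothetical helper: parse a decimal string into a uint16_t.
 * Returns 0 on success, -1 if the input is malformed or out of range. */
static int parse_u16(const char *s, uint16_t *out) {
  /* Require a leading digit: rejects NULL, empty input, signs, whitespace. */
  if (s == NULL || *s < '0' || *s > '9') return -1;
  char *end = NULL;
  errno = 0;
  unsigned long v = strtoul(s, &end, 10);
  /* Reject overflow, trailing garbage, and values that do not fit 16 bits. */
  if (errno != 0 || *end != '\0' || v > UINT16_MAX) return -1;
  *out = (uint16_t)v;
  return 0;
}

/* Hypothetical helper: parse and validate a slot range given as two strings.
 * Returns 0 on success, -1 on any parsing or ordering error. */
static int parse_slot_range(const char *startStr, const char *endStr,
                            uint16_t *start, uint16_t *end) {
  if (parse_u16(startStr, start) != 0 || parse_u16(endStr, end) != 0) return -1;
  if (*start > *end || *end > MAX_SLOT) return -1;
  return 0;
}
```

Parsing into the narrow type up front keeps range checks in one place, so callers never see a negative or oversized slot number.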
JoanFM
left a comment
Are we sure we do not want to stay compatible with both formats, at least for a while?
JoanFM
left a comment
Can HASREPLICATION now be removed easily from the ParseTopology logic?
JoanFM
left a comment
Is there a test for the case where ADDR and the other SHARD details do not come with the first SLOTRANGE but with a later one? Is it still parsed properly?
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #7508 +/- ##
==========================================
- Coverage 85.03% 84.68% -0.36%
==========================================
Files 349 350 +1
Lines 53822 54107 +285
Branches 14384 14505 +121
==========================================
+ Hits 45769 45821 +52
- Misses 7857 8097 +240
+ Partials 196 189 -7
MRClusterTopology *topo = RedisEnterprise_ParseTopology(ctx, argv, argc, &my_shard_idx);
// this means a parsing error, the parser already sent the explicit error to the client
if (!topo) {
  RedisModule_Log(ctx, "warning", "Received invalid cluster topology");
Let's add the faulty topology to the log
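One way to act on this request is to join the received arguments into a bounded buffer and include that string in the warning. The sketch below works on plain C strings so that it stays self-contained. In the module itself the arguments are `RedisModuleString` values, which would first need converting (for example with `RedisModule_StringPtrLen`). `join_args` is a hypothetical helper, not something the PR is known to contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join argv into buf as a space-separated string.
 * If the result does not fit, the tail of the buffer is replaced with "..."
 * so a reader of the log can tell that the line was cut.
 * Returns the number of characters written, excluding the terminating NUL. */
static size_t join_args(char *buf, size_t cap, const char *const *argv, int argc) {
  if (cap == 0) return 0;
  size_t len = 0;
  buf[0] = '\0';
  for (int i = 0; i < argc; i++) {
    int n = snprintf(buf + len, cap - len, "%s%s", i ? " " : "", argv[i]);
    if (n < 0) break; /* encoding error: keep what was written so far */
    if ((size_t)n >= cap - len) {
      /* Truncated: the buffer is full and NUL-terminated at cap - 1. */
      if (cap >= 4) memcpy(buf + cap - 4, "...", 4);
      return cap - 1;
    }
    len += (size_t)n;
  }
  return len;
}
```

The joined string could then be passed as a `%s` argument to the existing `RedisModule_Log(ctx, "warning", ...)` call. Capping the buffer matters because a topology command for a large cluster can carry many arguments.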
src/module.c
Outdated
  RedisModule_Log(ctx, "warning", "Received invalid cluster topology");
  return REDISMODULE_ERR;
}
RedisModule_Log(ctx, "notice", "Received new cluster topology with %u shards", topo->numShards);
Consider also adding the number of slot ranges to the log
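A sketch of what the extended log line might compute, assuming each shard records how many ranges it owns. `DemoShard`, `DemoTopology`, and both functions are hypothetical stand-ins; only the `numShards` field and the existing message text are taken from the diff above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the real topology types. */
typedef struct {
  uint32_t numRanges; /* slot ranges owned by this shard */
} DemoShard;

typedef struct {
  uint32_t numShards;
  const DemoShard *shards;
} DemoTopology;

/* Sum the slot ranges over all shards. */
static uint32_t count_slot_ranges(const DemoTopology *topo) {
  uint32_t total = 0;
  for (uint32_t i = 0; i < topo->numShards; i++) {
    total += topo->shards[i].numRanges;
  }
  return total;
}

/* Build the summary message; returns what snprintf returns. */
static int format_topology_summary(char *buf, size_t cap, const DemoTopology *topo) {
  return snprintf(buf, cap,
                  "Received new cluster topology with %u shards and %u slot ranges",
                  (unsigned)topo->numShards, (unsigned)count_slot_ranges(topo));
}
```

In the module, the two counts would be passed straight to `RedisModule_Log` rather than formatted into a buffer first. The buffer is used here only so the result can be inspected.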
@GuyAv46 One more debuggability request.
Also, please document everything you agreed upon regarding the protocol described here: https://redislabs.atlassian.net/wiki/spaces/DX/pages/5478907950/SEARCH.CLUSTERSET2+command
* support multiple slot ranges
* implement and move around helpers
* add tests
* ignore shards with no slots
* improve parsing
* better error message
* fix flow tests
* fix tests
* sort by node id
* improve error testing
* cover missing cases
* more error messages improvements
* address AI review
* stabilize Unexpected argument error path
* stabilize more error paths
* fix tests accordingly
* last fix
* add logs to cluster set command
* add more logs per @alonre24 request

(cherry picked from commit 5312b53)
Successfully created backport PR for |
…#7589)
* Support Multiple Slot Ranges in search.CLUSTERSET - [MOD-11657] (#7508)
  * support multiple slot ranges
  * implement and move around helpers
  * add tests
  * ignore shards with no slots
  * improve parsing
  * better error message
  * fix flow tests
  * fix tests
  * sort by node id
  * improve error testing
  * cover missing cases
  * more error messages improvements
  * address AI review
  * stabilize Unexpected argument error path
  * stabilize more error paths
  * fix tests accordingly
  * last fix
  * add logs to cluster set command
  * add more logs per @alonre24 request

  (cherry picked from commit 5312b53)
* add missing fix to dictAddOrFind

---------
Co-authored-by: GuyAv46 <[email protected]>
Co-authored-by: GuyAv46 <[email protected]>
Note
Add multi-range-per-shard support and a stricter `SEARCH.CLUSTERSET` parser with deterministic shard ordering, updated topology utilities, and extensive tests/mocks.
- `search.CLUSTERSET` parser: multiple `SLOTRANGE`s per shard, replica ignoring, endpoint/UNIX socket validation, and detailed offset-based errors.
- … (`MYID`), sort shards deterministically, and build topology accordingly.
- Expose `MRClusterTopology_SortShards` (sort by node id) and use it in Redis OSS topology fetch.
- Expose `MRClusterNode_Free` in header.
- Add `AC_GetU16` helper for 16-bit slot values.
- … `SEARCH.CLUSTERINFO`.
- … `ReplyWithErrorFormat`; add argv-from-vector helpers.

Written by Cursor Bugbot for commit 6e1abd0.
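The deterministic ordering mentioned in the note (sorting shards by node id) can be sketched with `qsort` and a string comparator. `SortableShard` and `sort_shards_by_node_id` are hypothetical names; the real `MRClusterTopology_SortShards` may compare different fields or use a different comparison.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shard record carrying only what the sort needs. */
typedef struct {
  const char *nodeId;
} SortableShard;

/* qsort comparator: order shards by node ID, byte-wise. */
static int cmp_by_node_id(const void *a, const void *b) {
  const SortableShard *sa = a;
  const SortableShard *sb = b;
  return strcmp(sa->nodeId, sb->nodeId);
}

/* Sort in place so that every caller sees the shards in the same order,
 * regardless of the order in which they were received. */
static void sort_shards_by_node_id(SortableShard *shards, size_t n) {
  if (n > 1) qsort(shards, n, sizeof(*shards), cmp_by_node_id);
}
```

Note that `strcmp` orders IDs lexicographically, so `"10"` sorts before `"2"`. That is sufficient for determinism, which only needs a total order that all nodes agree on.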