
[MOD-12519] implement skip multi in II iterators #7426

Merged: gdesmott merged 4 commits into master from gd_it_skip_multi on Dec 1, 2025

Conversation

@gdesmott (Collaborator) commented Nov 19, 2025

Implement InvIndIterator_Read_SkipMulti in Rust.
Also port more edge-case tests from the C++ tests.
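
As an illustration of what "skip multi" means here, the following is a minimal sketch and not the code from this PR. It assumes records are sorted by doc ID and that a multi-value document appears as several consecutive records with the same ID. The names `Record`, `SkipMultiIter`, and `collect_doc_ids` are hypothetical; only `read_skip_multi` echoes a name mentioned in the PR summary.

```rust
/// Hypothetical record type: one entry of an inverted index.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Record {
    doc_id: u64,
    value: f64,
}

/// Hypothetical iterator over records sorted by `doc_id`, possibly with repeats.
struct SkipMultiIter<'a> {
    records: &'a [Record],
    pos: usize,
    /// Doc ID of the last record returned to the caller, if any.
    last_doc_id: Option<u64>,
}

impl<'a> SkipMultiIter<'a> {
    fn new(records: &'a [Record]) -> Self {
        Self { records, pos: 0, last_doc_id: None }
    }

    /// Return the next record whose doc ID differs from the last one returned,
    /// or `None` once the records are exhausted.
    fn read_skip_multi(&mut self) -> Option<Record> {
        while let Some(&rec) = self.records.get(self.pos) {
            self.pos += 1;
            if self.last_doc_id == Some(rec.doc_id) {
                continue; // same document as the previous result: skip it
            }
            self.last_doc_id = Some(rec.doc_id);
            return Some(rec);
        }
        None
    }
}

/// Helper for the sketch: drain the iterator and collect the doc IDs it yields.
fn collect_doc_ids(records: &[Record]) -> Vec<u64> {
    let mut it = SkipMultiIter::new(records);
    let mut ids = Vec::new();
    while let Some(rec) = it.read_skip_multi() {
        ids.push(rec.doc_id);
    }
    ids
}
```

With records for doc IDs 1, 1, 2, 3, 3, this sketch yields each document once, keeping the first record of each run.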

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

Note

Implements skip-multi in Rust inverted index iterators with dynamic read dispatch, and refactors C iterators to infer skip-multi (removing the parameter) while adding targeted numeric tests.

  • Iterators:
    • C (src/iterators/inverted_index_iterator.c/.h):
      • Remove skipMulti field and skipMulti parameter from constructors; infer via IndexReader_HasMulti in ShouldSkipMulti.
      • Initialize Read function based on skipMulti and HasExpiration, selecting among Read_Default, Read_SkipMulti, Read_CheckExpiration, Read_SkipMulti_CheckExpiration.
      • Update all NewInvIndIterator_* call sites to drop the skipMulti arg; minor local var fix for skipMulti.
    • Rust (src/redisearch_rs/rqe_iterators/src/inverted_index.rs):
      • Add dynamic read_impl selection at construction using reader.has_duplicates().
      • Implement read_skip_multi and delegate read() to the selected impl.
  • Tests:
    • Add numeric iterator tests for duplicate doc IDs handling, filtered reads, and EOF behavior; adjust imports to use InvertedIndex, RQEIterator.
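
The read-function selection described in the summary above can be sketched as follows. This is an assumption-laden illustration written in Rust, not the PR's code: it mirrors the four-way choice described for the C side (`Read_Default`, `Read_SkipMulti`, `Read_CheckExpiration`, `Read_SkipMulti_CheckExpiration`) using a function pointer chosen once at construction. The `Iter` and `ReadKind` types and the two boolean constructor arguments are hypothetical; the real read functions return index results, not labels.

```rust
/// Labels used only to make the selected code path observable in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReadKind {
    Default,
    SkipMulti,
    CheckExpiration,
    SkipMultiCheckExpiration,
}

/// Hypothetical iterator that stores the read implementation picked at construction.
struct Iter {
    read_impl: fn(&mut Iter) -> ReadKind,
}

impl Iter {
    /// Choose the read implementation once, so `read()` does not branch on every call.
    fn new(has_duplicates: bool, has_expiration: bool) -> Self {
        let read_impl: fn(&mut Iter) -> ReadKind = match (has_duplicates, has_expiration) {
            (false, false) => Iter::read_default,
            (true, false) => Iter::read_skip_multi,
            (false, true) => Iter::read_check_expiration,
            (true, true) => Iter::read_skip_multi_check_expiration,
        };
        Self { read_impl }
    }

    /// Public entry point: delegate to whichever implementation was selected.
    #[inline]
    fn read(&mut self) -> ReadKind {
        (self.read_impl)(self)
    }

    // Placeholder bodies: each returns only its label.
    fn read_default(&mut self) -> ReadKind {
        ReadKind::Default
    }

    fn read_skip_multi(&mut self) -> ReadKind {
        ReadKind::SkipMulti
    }

    fn read_check_expiration(&mut self) -> ReadKind {
        ReadKind::CheckExpiration
    }

    fn read_skip_multi_check_expiration(&mut self) -> ReadKind {
        ReadKind::SkipMultiCheckExpiration
    }
}
```

Storing a function pointer keeps the per-read path free of repeated flag checks, at the cost of an indirect call.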

Written by Cursor Bugbot for commit 9c3353e.

@gdesmott gdesmott requested a review from GlenDC November 19, 2025 15:54
@gdesmott gdesmott changed the title from "implement skip multi in II iterators" to "[MOD-12519] implement skip multi in II iterators" Nov 19, 2025
GlenDC previously approved these changes Nov 20, 2025
codecov bot commented Nov 21, 2025

Codecov Report

❌ Patch coverage is 94.59459% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.71%. Comparing base (b35b3ee) to head (9c3353e).
⚠️ Report is 4 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../redisearch_rs/rqe_iterators/src/inverted_index.rs | 92.85% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7426      +/-   ##
==========================================
+ Coverage   84.70%   84.71%   +0.01%     
==========================================
  Files         350      350              
  Lines       54017    54038      +21     
  Branches    14505    14528      +23     
==========================================
+ Hits        45756    45780      +24     
+ Misses       8072     8067       -5     
- Partials      189      191       +2     
| Flag | Coverage Δ | |
|---|---|---|
| flow | 85.16% <100.00%> (-0.07%) | ⬇️ |
| unit | 52.06% <94.59%> (+0.05%) | ⬆️ |


@gdesmott gdesmott force-pushed the gd_filter_it branch 2 times, most recently from 9654182 to 870d9f2 Compare November 24, 2025 11:21
BenGoldberger previously approved these changes Nov 24, 2025
@gdesmott gdesmott force-pushed the gd_filter_it branch 2 times, most recently from 2525f6f to 82d5350 Compare November 25, 2025 08:13
Base automatically changed from gd_filter_it to master November 26, 2025 16:28
@chesedo chesedo dismissed stale reviews from BenGoldberger and GlenDC November 26, 2025 16:28

The base branch was changed.

* remove it->skipMulti

No longer needed as the full iterators have been removed.
The query ones always set it to true.

* implement skip multi

II query iterators are supposed to skip results having the same ids.

Test ported from test_cpp_iterator_index.cpp

* inline read() and skip_to()

* test more ii iterator edge cases

Port of GetCorrectValue and EOFAfterFiltering from test_cpp_iterator_index.cpp
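
To make the edge cases named in these commits concrete (duplicate doc IDs, filtered reads, EOF), here is a small hypothetical sketch. It is not the ported `GetCorrectValue` or `EOFAfterFiltering` tests, whose exact assertions are not shown on this page. It assumes a numeric range filter applied on top of skip-multi, with a document returned once, carrying the first of its values that passes the filter. `Entry`, `FilteredIter`, and the `at_eof` flag are illustrative names.

```rust
/// Hypothetical numeric index entry: a document ID and one of its values.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Entry {
    doc_id: u64,
    value: f64,
}

/// Hypothetical iterator combining an inclusive range filter with skip-multi.
struct FilteredIter<'a> {
    entries: &'a [Entry],
    pos: usize,
    last_doc_id: Option<u64>,
    min: f64,
    max: f64,
    at_eof: bool,
}

impl<'a> FilteredIter<'a> {
    fn new(entries: &'a [Entry], min: f64, max: f64) -> Self {
        Self { entries, pos: 0, last_doc_id: None, min, max, at_eof: false }
    }

    /// Return the next (doc_id, value) pair that passes the filter and whose
    /// doc ID was not already returned; set `at_eof` when nothing is left.
    fn read(&mut self) -> Option<(u64, f64)> {
        while let Some(&e) = self.entries.get(self.pos) {
            self.pos += 1;
            if self.last_doc_id == Some(e.doc_id) {
                continue; // this document was already returned
            }
            if e.value < self.min || e.value > self.max {
                continue; // value rejected by the range filter
            }
            self.last_doc_id = Some(e.doc_id);
            return Some((e.doc_id, e.value));
        }
        // Every remaining entry was filtered out or skipped: report EOF.
        self.at_eof = true;
        None
    }
}
```

In this sketch, an iterator whose trailing entries are all rejected by the filter reaches EOF on the next read instead of returning a stale result.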
@gdesmott gdesmott added this pull request to the merge queue Dec 1, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 1, 2025
@gdesmott gdesmott added this pull request to the merge queue Dec 1, 2025
Merged via the queue into master with commit 6fac654 Dec 1, 2025
26 checks passed
@gdesmott gdesmott deleted the gd_it_skip_multi branch December 1, 2025 15:01
alonre24 added a commit that referenced this pull request Dec 3, 2025
* fix: Avoid Rust cache contamination across platforms. (#7569)

* [MOD-12170] Implement ASM State Machine on notifications (#7331)

* first commit

* fix: handle lock in two phases

* inform ownership in regular command process handlers

* fix: handle proper init

* fix: use conditional variable

* fix: add some comments

* test: add test proving deadlock

* fix: changes as per comments

* fix: remove this locking from conn

* remove code where certainty is not so high

* fix potential issue

* test: add first unit tests version

* test: add proper testing with the value proposition

* test: improve testing

* alternative using auxiliary lock

* simplify thpool a bit

* make sure RedisModule_Yield is protected

* fix: fix issue raised by cursor

* fix: handle potential deadlock in drain also

* fix: fix potential TOCTOU concurrency bug

* fix: fix order of release

* fix: fix potential overflow issue

* fix tests

* fix: fix counts in all variables

* small refactor

* some refactoring of Shared Exclusive Lock

* simplify

* protect GILOwned simple bool

* clarify some comments

* set GILAlternativeLockHeld to true properly

* fix spelling

* add assertion

* fix: fix import RS_LOG_ASSERT

* test: add more conditions to testing

* add some more logic

* improve testing to properly signal that the main thread can finish while other threads may be waiting for the Shared Lock in the loop

* test: handle test properly

* add another pattern of tests

* fix comments from cursor

* fix concurrency bug

* fix: fix potential race condition at release lock time

* fix: add condition

* test: add more testing

* force testing further to capture more potential errors

* parametrize tests

* test: make tests a little faster

* test: add micro benchmark

* fix compile microbenchmarks

* fix: avoid potential reentrant deadlocks

* fix: avoid potential reentrant deadlocks

* test: avoid leak in test

* test: avoid leak in test

* fix: fix assertion

* fix: fix assertion

* Simplify Shared Lock internals (#7267)

* simplify shared lock

* small improvement to set_timeout

* fix comment

* fix and improve comments

* condition fix

* remove lock type from release API

* Add lock type back to the release API

* remove Unlocked from enum and handle clock init for macOS

* adapt to use new API

* change according to comments

* handle PR comments

* handle PR comments

* fix: take shared lock in other cases

* change as PR comments

* test: simplify test, do not allow query errors

* fix tests as per comments

* fix: handle number of high priority jobs running

* fix: fix test comment

* add ASM to help slot tracking in notifications

* test: add some testing idea

* change as per PR comments

* compile and link test fix

* change draining method to drain high priority

* add ctests for ASM State Machine

* test: add ASM tests

* test: complete tests

* change as per PR comments

* add micro benchmarks with jobs in threads

* remove changes not wanted

* checkout redis feature branch in task test

* remove draining

* move atomic to new header

* remove _internal naming

* fix formatting

---------

Co-authored-by: GuyAv46 <[email protected]>

* [MOD-12627] Add Debug Support for `FT.PROFILE` Command (#7510)

* imp debug profile for SA:

introduce in module.h: RSProfileCommandImp
RSProfileCommand calls RSProfileCommandImp(isDebug = false) for regular execution
ProfileCommandCommand_DebugWrapper calls it with isDebug=true and skips _FT.DEBUG

introduce entrypoint for _FT.DEBUG FT.PROFILE in debug_commands:
ProfileCommandCommand_DebugWrapper

RSProfileCommandImp calls DEBUG_execCommandCommon if it's debug

_recursiveProfilePrint skips printing debug RP

* pass is debug instead of extracting:

module.h:
replace declaration: DistAggregateCommand
DistSearchCommand
with Imp version that receives isDebug

expose ProfileCommandHandlerImp

align debug_commands

introduce _FT.DEBUG _FT.PROFILE

* add test for cluster

* return res

* augi fixes

* fix spell check

* fix for real

* fix test

* skip tests according to env

* revert test_profile changes

* remove changes from internal_only

* [MOD-12694] [MOD-12069] Add active_coord_threads metric (#7546)

* Add multi-threading statistics tracking for active I/O threads

* fix comment

* add cpp test

* fix spelling

* address comment

* remove unnecessary new line

* add "active_worker_threads" metric

* fix comment
imp tests

* test cleanups

* add test comment about the num queries and cleanups

* fix declaration

* remove coord threads

* add active_coord_threads

expose ConcurrentSearchPool_WorkingThreadCount

* make the tests run...

* add "active_worker_threads" metric

* fix declaration

* remove coord threads

* make the tests run...

* introduce workersThreadPool_isInitialized
assert it is initialized in GlobalStats_GetMultiThreadingStats

* cleanup

* rename workersThreadPool_isCreated

* introduce ConcurrentSearchPool_IsCreated

* fix test

* we dont need workers

* remove ConcurrentSearchPool_IsCreated and workersThreadPool_isInitialized

* fix merge

* [MOD-12789] test: fix flaky thpool test (#7581)

test: fix flaky thpool test

* Support Multiple Slot Ranges in search.CLUSTERSET - [MOD-11657] (#7508)

* support multiple slot ranges

* implement and move around helpers

* add tests

* ignore shards with no slots

* improve parsing

* better error message

* fix flow tests

* fix tests

* sort by node id

* improve error testing

* cover missing cases

* more error messages improvements

* address AI review

* stabilize Unexpected argument error path

* stabilize more error paths

* fix tests accordingly

* last fix

* add logs to cluster set command

* add more logs per @alonre24 request

* rename MetricIterator to Metric (#7586)

The other iterators are not suffixed with 'Iterator'.

* [MOD-12701] Split the execution of Rust and C/C++ unit tests across two different CI steps. (#7587)

Split the execution of Rust and C/C++ unit tests across two different CI steps.

* [MOD-12519] implement skip multi in II iterators (#7426)

* remove it->skipMulti

No longer needed as the full iterators have been removed.
The query ones always set it to true.

* implement skip multi

II query iterators are supposed to skip results having the same ids.

Test ported from test_cpp_iterator_index.cpp

* inline read() and skip_to()

* test more ii iterator edge cases

Port of GetCorrectValue and EOFAfterFiltering from test_cpp_iterator_index.cpp

* Compress layers prior to exporting them to the Docker layer cache (#7529)

* Compress Docker layers prior to exporting them.

* Ignore boost subfolders recursively

* [MOD-12417] Track maxprefixexpansions errors and warnings in info (#7570)

track maxprefixexpansions

* [MOD-12409]: Port DocumentType enum to Rust (#7590)

Port DocumentType enum to Rust

Port DocumentType enum to Rust as `document::DocumentType`, removing it from `redisearch.h`.
Use `document::DocumentType` in `rlookup` instead of `rlookup::bindings::DocumentType`.

* [MOD-12069] Add `*_pending_jobs` metrics (#7556)

* align info/* to active_coord

* add APIs to get queues length

* add to info

* fix

* test

* fix test

* catch general error

* rename

* fix moduleArgs

* rename

* rename test_active_worker_threads

* rename to workers

* [MOD-12392] Remove numDocs parameter from non-optimized Wildcard iterator (#7602)

Remove numDocs parameter from non-optimized Wildcard iterator

* [MOD-12701] Enforce a per-test timeout in the C++ rstest suite using ctest (#7588)

* Enforce a per-test timeout in the C++ rstest suite using ctest

* Disable problematic tests

* Raise timeout to 60s

* Raise timeout

* Add a timeout for coordinator tests too

* Skip ActivateIoThreadsMetric test

* Register Cursor Sub-Commands as such - [MOD-12807, MOD-12808] (#7571)

* split cursor command

* fix and improve tests

* cover error cases

* fix cursor leaks

* Add "TODO: run hybrid cursor" back

* remove new empty line

* small test improvement

* fix FT.CURSOR GC

* de-flake test

* make CURSOR PROFILE internal only

* test the free

* Keep just the prints

* test on macos and noble as well

* Move the free

* Remove

* Add to non container

* Back to regular test

* Remove spaces

* Moved the remove and add the repo size

* Moved it again

* Move it to after repo build

* Move print after repo build

* test all

* run the temp flow

* add mount to container

* Fix curly

---------

Co-authored-by: Luca Palmieri <[email protected]>
Co-authored-by: Joan Fontanals <[email protected]>
Co-authored-by: GuyAv46 <[email protected]>
Co-authored-by: meiravgri <[email protected]>
Co-authored-by: Guillaume Desmottes <[email protected]>
Co-authored-by: lerman25 <[email protected]>
Co-authored-by: Henk Oordt <[email protected]>
Co-authored-by: alonre24 <[email protected]>
pull bot pushed a commit to Mu-L/RediSearch that referenced this pull request Dec 3, 2025
* Reduce merge queue

* add ability to run manually

* CR comments

* allow workflow call for testing

* Fix naming in build image + remove quick from flow intel as well

* use ubuntu noble rather than latest

* fix noble typo

* CR fixes

* CR fixes 2

* don't use container for cov and san

* restore leftover

* remove mac + intel and workflow call

* remove macos intel from matrix

* change back ubuntu:latest to ubuntu:noble in merge-to-queue as per Jonathan comment

* measure disk space

* fix step name

* Free disk on container (RediSearch#7613)

* Revert "Free disk on container (RediSearch#7613)"

This reverts commit 9c68740.

* add free disk step

* rename + remove temp test

* remove leftover

* better readability in container input

Co-authored-by: GuyAv46 <[email protected]>

* fix double defaults

---------

Co-authored-by: dor-forer <[email protected]>
Co-authored-by: Luca Palmieri <[email protected]>
Co-authored-by: Joan Fontanals <[email protected]>
Co-authored-by: GuyAv46 <[email protected]>
Co-authored-by: meiravgri <[email protected]>
Co-authored-by: Guillaume Desmottes <[email protected]>
Co-authored-by: lerman25 <[email protected]>
Co-authored-by: Henk Oordt <[email protected]>