Skip to content

MOD-11335: Sync master and feature-RRF branch#6862

Merged
nafraf merged 251 commits intofeature-RRFfrom
nafraf_feature-RRF-merge-master
Sep 22, 2025
Merged

MOD-11335: Sync master and feature-RRF branch#6862
nafraf merged 251 commits intofeature-RRFfrom
nafraf_feature-RRF-merge-master

Conversation

@nafraf
Copy link
Collaborator

@nafraf nafraf commented Sep 19, 2025

Merge master branch into feature-RRF.

Changes:

  • The feature branch was updated to support the changes included in Iterators refactor - Replace Old Iterators implementation with the new ones - [MOD-8850] #6572
    • use of new iterators
    • ConcurrentSearchCtx type was removed
  • RLookupRow_Move() was deleted in master, but now we are using it, then it was recovered.
  • metric.c was deleted, its changes were moved to idlist_iterator.c
  • AREQ_Free() was modified because req->rootiter is already freed by Pipeline_Clean() -> QITR_FreeChain() -> rpQueryItFree()

chesedo and others added 30 commits June 11, 2025 13:19
* Migrate `RSAggregateResult` to Rust

This is needed to move `RSIndexResult` to Rust which will make it easier
to port the inverted index to Rust.

* Better rename annotation

* Dummy `RSIndexResult` forward definition

This just makes the C compiler happy when "reading" values of the vector.

* Make fields public

* Better forward declaration

* Migrate `RSResultType` to Rust

This will makes the usages of `RSAggregateResult` more correct inside
the Rust code.

* Prefix enum name correctly

* Cast `int` to `RSResultType`

* Use a bit flag for the mask instead

* Use latest `enumflags2` until new release

The latest code has a fix we need for the Rust code to compile. But it
is not released yet so just using it directly for now.

* Rebase fixes

* Update `enumflags2` to latest release

* Add `enumflags2` to dev dependencies

This is needed to make the doc tests pass since the FFI is going to pull
in a comment which shows how to use `BitFlags`.
fix CI build issues
…ucture (#6250)

* initial rust result processor bindings and infra

* use generated FFI types wherever possible

* address more review comments

* remove MaybeUninit

* remove counter file

* add result_processor to default workspace members

* fixes

* fix CI build issues

* fix doc comments

* fix ffi doctests

* fix tests

* use pin-project instead of pin-project-lite

* address review comments

* fix lint

* address review comments

* remove Paused error

* add tests

* fix stacked borrows unsoundness

* fix unused in tests
* prepare benchmarks binaries

* add micro benchmark workflows

* some refactoring to mix with Rust microbenchmarks

* add benchmark labeler

* add different benchmark mains

* fix runner file name

* fix error in github YML

* change the vars to pass to configure

* try to give permissions

* similar values to hosted template

* try revert back the change on the region

* try micro benchmark runner

* change github token

* try other token

* try token from template

* hardcode region

* try performance access keys

* add new secgroups and subnets

* fix merge mess

* install dependencies

* try another approach

* try noninteractive

* update .github/workflows/flow-micro-benchmarks-runner.yml

* update .github/workflows/flow-micro-benchmarks-runner.yml

* update .github/workflows/flow-micro-benchmarks-runner.yml

* fix stop runner

* add redisearch-micro-benchmarks triggering env

* try adding comparison message to PR

* fix install redisbench admin deps

* try without running

* unblock the comment

* pass arch to redisbench admin

* address PR comments
* Normalize multi-byte char terms

* Fix rm_strdupcase_utf8

* Remove debug log

* Add Russian alphabet and diacritic tests

* Don't use DefaultNormalize_utf8() for chinese

* print available locales

* Check locale before using utf8 normalization

* Fix DefaultNormalize()

* Skip multibyte tests if 'en_US.UTF-8' locale is not available

* Test multibyte stopwords

* Revert changes in install_script.sh

* run sanitizer using ubuntu:latest

* Test_cn: fix language_field

* Remove unused strtolower function from misc.c and misc.h

* Fixes from code review

* Enhance multibyte character tests

* Validate queries using multi-byte stopwords

* Add MULTIBYTE_CHARS config param and install locales

* Use setlocale instead of querylocale for better compatibility

* revert changes in event-pull_request.yml and install French locale for debian/ubuntu

* Convert to lowercase using nunicode library

* nunicode_tolower() returns zero terminated string

* Refactor nunicode_tolower() to improve readability

* Refactor nunicode_tolower() to use destination buffer and improve memory management

* Remove null termination in nunicode_tolower()

* Improve documentation

* Support multi-byte chars for tags

* Support multi-byte chars for synonyms

* unicode_tolower(): avoid duplicating encoded input

* Increase SSO_MAX_LENGTH

* Update test to  use FT.DEBUG DUMP_TAGIDX instead of FT.TAGVALS

* Keep previous test_cn:testSynonym and test_cn:testMixedHighlight

* Update CMake configuration for consistent multi-byte char sorting

* Test JSON index

* Fix JSON test and remove unneeded tolower()

* Fix query_EvalSingleTagNode()

* Refactor string normalization to use case folding instead of lowercase conversion

* Refactor tests to use run_command_on_all_shards for setting DEFAULT_DIALECT

* BWC: Add function to convert string to single codepoint folded runes

* Rename rm_strdupcase() to rm_normalize()

* Add stemming test

* Refactor string normalization to use lowercase transformation instead of folding.
Modify FT.SUGGET flow to be BWC

* Refactor filtering functions to use a transformation callback for rune processing. Remove dead code.

* Update testDFAFilter to use strToSingleCodepointFoldedRunes for rune processing

* Refactor string normalization to use case folding (single codepoint) instead of lowercase transformation

* Revert "Refactor string normalization to use case folding (single codepoint) instead of lowercase transformation"

This reverts commit ebcf994.

* Add case sensitivity option to tag string processing functions

* Rename tag_strtofold to tag_strtolower for clarity and update related references to reflect lowercase transformation

* Simplify length checks in tag_strtolower and rm_normalize functions

* Add documentation after review

* WIP: Enhance unicode_tolower function to handle memory allocation for transformed strings

* WIP: Fixing unicode_tolower usage

* Fix tag_strtolower() to support memory realloc

* skip testToLowerSize() on cluster

* Fix rm_normalize and add more tests

* Fix memory leak

* Fix memory leak 2

* MOD-9609: Pass Private Data Context Back To UnblockClient API (#6237)

* Apparently private data had to be sent back when unblocking the client in order for the free function to get called

* code review comment

* solve memory leak

* MOD-9774: Stop Compiling Redis-With Coverage (#6231)

* stop compiling redis-server with coverage

* bring back coverage as requirement

* Add unit test for unicode_tolower

* Fix memory management in UnicodeToLowerTest and update expected results in multibyte character tests

* Add tests for unsupported Unicode characters in tolower conversion.

* MOD-9768: Add Rust abstractions for Buffer* (#6175)

* Add Rust abstractions for Buffer*

* Add tests

* Make Buffer fields pub

For consistency with BufferReader and BufferWriter. Also there is a very
good chance of us needing to use them directly in the near future.

* Expand the docs a bit more

* Add and improve upon safety comments

* fmt fix

* Add forgotten license headers

* Revert "Make Buffer fields pub"

This reverts commit 1243590.

* Address review comments

* Address review comments

* Better document safety of `Buffer` fields

* Apply suggestions from code review

Co-authored-by: Luca Palmieri <[email protected]>

* Remove an unneeded import

* Use reallocation in mock `Buffer_Grow`

* Satisfy MIRI about some pointer access

* Drop unneeded dep

* Remove unnecessary #includes

* Split FFI into a separate crate

* Add buffer.h API to ffi crate

* Buffer API now based on generated bindings for the C Buffer API

* CI: Set `CARGO_PROFILE_OPTIMISED_TEST_BUILD_OVERRIDE_DEBUG` env

Just acting on the advice from a panic:

```
To improve backtraces for build dependencies, set the CARGO_PROFILE_OPTIMISED_TEST_BUILD_OVERRIDE_DEBUG=true environment variable to enable debug information generation.
```

* Drop linking of internal libs in ffi crate

* Apply suggestions from code review

Co-authored-by: Luca Palmieri <[email protected]>

* Revert "CI: Set `CARGO_PROFILE_OPTIMISED_TEST_BUILD_OVERRIDE_DEBUG` env"

This reverts commit 9bd1d87.

* Drop a redundant import

* Add a few debug_asserts

* Bring back an accidentally deleted assertion in a test

* Fix a .gitignore file

* Make a safety assumption explicit

* Dont build `buffer` crate as static lib

* Don't build bindings for oxidized triemap code

* Make use of BufferWritter::buffer_mut

* Fix build on Alpine

---------

Co-authored-by: Luca Palmieri <[email protected]>

* refactor EmptyIterator (#6240)

* refactor EmptyIterator

* fix useless imports and doc

* update empty_iterator.c to add newline

* add gcc11:bullseye to platforms [MOD-9529] (#6198)

* add gcc11:bullseye to platforms

* add naming

* remove deps

* add nightly platforms option

* set merge queue to false

* fix flow

* fix flow

* chore: Improve granularity of build invalidation logic for FFI bindings (#6244)

* Add `tests/deps/RedisJSON` to `.gitignore` (#6249)

* Skip tests of unsupported unicode 9.0.0 chars

* Fix mem leak 3

* Test stopwords which require memory reallocation

* unicode_tolower update out_len always

* Refactor unicode_tolower function to receive a single  *inout_len argument

* Update documentation for tag_strtolower()

* Fixes after review

* tokenizeTagString: Rename origLen as len

* Test prefix/infix/suffix search with supported unicode chars

* Update test description

* Fix testToLowerConversionExactMatch

* testTextToLowerConversionSimilarMatch: remove unused doc_id in queries

* fix crash in test

* Fix test failure on focal and bullseye: avoid nested f-strings with curly brackets

* Fix test failure on mac: invalid escape sequence

* Use SSO in runesToStr and strToLowerRunes

* Add utf8_len arg to strToLowerRunes()

* Align and move strToLowerRunes comment to header

---------

Co-authored-by: kei-nan <[email protected]>
Co-authored-by: Zeeshan Ali Khan <[email protected]>
Co-authored-by: Luca Palmieri <[email protected]>
Co-authored-by: Joan Fontanals <[email protected]>
Co-authored-by: BenGoldberger <[email protected]>
Co-authored-by: oshadmi <[email protected]>
…Rust definitions (#6257)

* Reduce the amount of unsafe code by using more idiomatic Rust definitions in buffer

* Use a test rather than a const block

* Update src/redisearch_rs/buffer/src/writer.rs

Co-authored-by: Pieter <[email protected]>

* Update src/redisearch_rs/buffer/src/reader.rs

Co-authored-by: Pieter <[email protected]>

* Improve test coverage

* Ignore miri on panicking tests

* Back to const blocks

* Explain what the check is meant to be used for

---------

Co-authored-by: Pieter <[email protected]>
* Rust Varint implementation

This is the Rust re-implementation of the Varint* API. We'll use this
later to replace the C version.

* Add benchmarks for Rust & C Varint implementations

* Don't reimpelement the wheel

* Unify benchmarks

* Longer duration for Varint Vector writer benches

* Slightly longer warmup time for benches

* Rename a benchmark

* Initialize Vec with capacity (just like we do for C)

* Fix benchmarking of C varint decoding

* Changes needed after the rebase

* Fix clippy warnings

* Apply suggestions from code review

Co-authored-by: Pieter <[email protected]>

* Derive `publish` in subcrates' Cargo.toml from workspace

* Remove unused ffi crate dep

* Remove unused internal API

* Ensure Rust implementation produce the same encoding as C

* Remove redundant info

* Don't compile encode_decode as a separate static lib

* Drop unneeded newly added raw functions in C

* Revert un-inlining of Varint decoding functions

* Be fair to the C varint encoders in the benchmarks

* Document varint::VectorWriter::shrink_to_fit's return value

* Add missing newline at the end of .gitignore file

* Always inline the varint encode functions

* Remove an uneeded allocation

* Revert "Don't compile encode_decode as a separate static lib"

This reverts commit 6597daa.

* Reapply "Don't compile encode_decode as a separate static lib"

This reverts commit c7bc844.

* Fix coverage job

* Use a fresh buffer for every iteration of the benchmark

* Fix writing benches

---------

Co-authored-by: Pieter <[email protected]>
Co-authored-by: LukeMathWalker <[email protected]>
* Migrate `RSIndexResult` to Rust

This is the last step to be able to more easily migrate the inverted
index to Rust too.

* Fix tests

* Reduce exports

* Remove `Copy` on types containing raw pointers

The problem with copy on raw pointers is that we might end up with
multiple mutable references to the raw types or just mixed mutable and
immutable references. This is obviously a memory / data race concert
which we don't want.

* Remove `Clone` since it is no longer needed

* Simplify derives on numeric record

Remove things which are no longer needed.
* skeleton for NotIterators refactor with its dependant iterators

* tests added for wildcard (Not Optimized) and Empty Iterator

* rename test files

* NotIterator Common Tests

* test NotIterator with timeout from children and wildcard

* handle test where TimeOut comes from iterator itself and no children

* update license msg

* add micro benchmarks for NotIterator

* add microbenchmark for NotIterators

* fix the benchmark setting

* apply suggestions from code review

Co-authored-by: GuyAv46 <[email protected]>

* first set of reviews

* fix microbenchmarks

* fix algo for ReadOptimized

* improve PR a little

* SkipTo from childIds properly tested

* SkipTo tests ready

* tests and microbenchmarks working

* fix and better comments

* some refactoring to clarify code

* simplify code a lot

* improve benchmarks

* test: add more tests to increase confidence

* add more cases

* add some more tests and benchmarks

* remove old test

* fix sanitizer

* improve code coverage

* add a new param to micro benchmark to test only with the subset condition

* separation of NonOptimized to simplify PR

* change ReadNotOptimized

* comment the Optimized micro benchmarks

* address some comments in PR

* use Init to unify logic

* apply suggestions from code review

Co-authored-by: GuyAv46 <[email protected]>

* more changes due to PR review

* changing testing self timeout trigger

* simplify test matric

* simplify benchmark scenarios

* skip test

* test maxDocId smaller than maxChildId

---------

Co-authored-by: GuyAv46 <[email protected]>
* move the lock to include the opening of the indexes

* add pytest

* try to fix test

* fix test

* move lock sooner

* spell checker

* try to fix test

* pr changes

* pr changes

* remove debugPrint in test

* remove debug prints

* add gc invokes while deleting docs

* add timeout before read

* add the timeout to gc struct

* change timeout to 3 min

* try to change the while

* switch to poll

* refactor gc struct

* add gc_wait in test

* pr changes

* pr changes

* pr changes

* pr change

* try to fix leak in test

* revert priority change

* remove deps changes

* pr changes

* revert len check
* Remove unnecessary parameter

* Replace direct field access with inline functions

* update C api for from one variant function to put values to three type safe functions

* fix: possible null deref

* address review comment -> use _inl postfix to distinguish functions

* use non inline function

* expose RSValue_IsNull, remove inline qualifiers for benchmark

* remove check in GetLength

* rename to reduce changed lines
* initial refactor

* improve implementation

* minor simplifications

* implement NewInvIndIterator API

* fixes after rebase

* fix headers

* add tests and fix test build

* fix test

* fix mock constructors

* fix build

* add a benchmark

* address review comments

* more improvements

* Optional Iterator refactor - [MOD-8844]

It contains the code for the Optional Iterator refactor.

Implementation has been taken from https://github.com/RediSearch/RediSearch/pull/5619/files

* test to use mock

* fix benchmark test

* improve benchmark fix

* pr changes

* Optional Iterator refactor - [MOD-8844]

It contains the code for the Optional Iterator refactor.

Implementation has been taken from https://github.com/RediSearch/RediSearch/pull/5619/files

* test to use mock

* pr changes

* pr changes

* add old iterator benchmarks

* pr change

* initial refactor

* improve implementation

* minor simplifications

* implement NewInvIndIterator API

* fixes after rebase

* fix headers

* add tests and fix test build

* fix test

* fix mock constructors

* fix build

* add a benchmark

* address review comments

* more improvements

* fix benchmark test

* improve benchmark fix

* Optional Iterator refactor - [MOD-8844]

It contains the code for the Optional Iterator refactor.

Implementation has been taken from https://github.com/RediSearch/RediSearch/pull/5619/files

* test to use mock

* pr changes

* test to use mock

* pr changes

* pr changes

* add old iterator benchmarks

* pr change

* move mockqueryctx

* improve implementation

* implement NewInvIndIterator API

* address review comments

* Optional Iterator refactor - [MOD-8844]

It contains the code for the Optional Iterator refactor.

Implementation has been taken from https://github.com/RediSearch/RediSearch/pull/5619/files

* test to use mock

* pr changes

* test to use mock

* pr changes

* pr changes

* add old iterator benchmarks

* pr change

* initial refactor

* improve implementation

* minor simplifications

* implement NewInvIndIterator API

* test to use mock

* pr changes

* Optional Iterator refactor - [MOD-8844]

It contains the code for the Optional Iterator refactor.

Implementation has been taken from https://github.com/RediSearch/RediSearch/pull/5619/files

* test to use mock

* pr changes

* move mockqueryctx

* remove unwanted changes

* pr changes

* cleanup and pr changes

* remove deps

* pr changes

* fix test

* fix mockiterator constructors

* fix test

---------

Co-authored-by: GuyAv46 <[email protected]>
Co-authored-by: GuyAv46 <[email protected]>
* implement iterator

* refine API

* implement tests

* implement benchmark

* cleanup after EOF/error

* simplify benchmark

* fix benchmark

* improve benchmark setup

* API cleanup and documentation

* add more tests

* address copilot review

* fix sanitizer build and improve tests

* fix test

* address CR and some additional refactors

* test improvement

* minor fix (set current before checking it)

* address CR suggestions

* test cleanup

* verify the children are sorted

* add a comment
* fix microbenchmark link in comment

* run microbnchmarks nighly

* fix the problem when pushing to integ

* remove notify failure
* skeleton for NotIterators refactor with its dependant iterators

* tests added for wildcard (Not Optimized) and Empty Iterator

* rename test files

* NotIterator Common Tests

* test NotIterator with timeout from children and wildcard

* handle test where TimeOut comes from iterator itself and no children

* update license msg

* add micro benchmarks for NotIterator

* add microbenchmark for NotIterators

* fix the benchmark setting

* apply suggestions from code review

Co-authored-by: GuyAv46 <[email protected]>

* first set of reviews

* fix microbenchmarks

* fix algo for ReadOptimized

* improve PR a little

* SkipTo from childIds properly tested

* SkipTo tests ready

* tests and microbenchmarks working

* fix and better comments

* some refactoring to clarify code

* simplify code a lot

* improve benchmarks

* test: add more tests to increase confidence

* add more cases

* add some more tests and benchmarks

* remove old test

* fix sanitizer

* improve code coverage

* add a new param to micro benchmark to test only with the subset condition

* fix tests

* update call to Optimized WCIterator

* fix compilation

* iteration of PR comments, bool and no EOF at timeout

* small refactor avoid LC_COV comment

* small optimization when child is already at EOF

* get rid of helper method

* simplify optimized SkipTo

* avoid calling Read on child first and improve readibility

---------

Co-authored-by: GuyAv46 <[email protected]>
Co-authored-by: GuyAv46 <[email protected]>
Fix parser for negation of unexistent function
* WIP: Implement func_case

* Add tests

* Test nested case

* Rename testNestedCase to testNestedCaseAndGroup

* Add test for invalid apply expressions

* Remove TODO comment

* Fix flaky test

* Test with document which doesn't have neither of the fields.

* Add unit tests

* Add tests for NULL and missing fields

* Add more unit tests

* refactor evalFuncCase and update test cases for func_case handling

* Fix typo

* Refactor testCaseWithLogicalOperators and testCaseWithMissingFields to improve product classification and result validation

* Add test with deep nesting of case

* One more simple test

---------

Co-authored-by: oshadmi <[email protected]>
* initial rust result processor bindings and infra

* use generated FFI types wherever possible

* address more review comments

* remove MaybeUninit

* remove counter file

* add result_processor to default workspace members

* fixes

* fix CI build issues

* fix doc comments

* fix ffi doctests

* fix tests

* use pin-project instead of pin-project-lite

* address review comments

* fix lint

* address review comments

* remove Paused error

* add tests

* fix stacked borrows unsoundness

* fix unused in tests

* readd counter

* rebase fixes

* rebase fixed & replace pin-project-lite

* fix license comments

* rebase

* define mock SeachResult_Clear for test

* cleanup

* rename header file

* remove allows

* drop cbindgen configs

* add tests for FFI function

* fix typo

* use NonNull instead of *mut

* add additional type assertion

* fix tests

* update comment

* cleanup search result

* fix cfg

* move tests into lib.rs

* add license header

* fix tests checks

* update comments
…6345)

* MOD-10187 refactor C interface

* fix

* remove flag

* actually use correct flags

* Update src/rlookup.h

Co-authored-by: Raz Monsonego <[email protected]>

---------

Co-authored-by: Raz Monsonego <[email protected]>
* add optimized implementation

* build fixes

* fix assertion

* small bug fix

* some initial tests

* fix mock constructor

* test improvements

* add another test

* add benchmarks

* revert temporary benchmark setup

* assume valid input (add assertions)
* fix TTL dictionary

* add a failing test

* delete dummy test
remove arm from amzn2 + use test only for debian with gcc11
* Get a basic Rust benchmark working for the C code

This creates a Rust benchmark runner for the C numeric encoder and
decoder only. This will serve as a base for more encoder / decoder benchmarks.

* Don't make FFI for C buffer

The benchmarks are using the Rust wrapper instead.

* Make it easier to construct a buffer

* Prep for testing all the basic numeric inputs

* Test all the numeric encoding types

* Complete `Debug` and `PartialEq`

* Wrap up inverted index module

* Clean up the build file

* Finish up the C file changes

* Fix FFI include paths

This means `numeric_index.h` no longer needs a path to include `inverted_index.h`.

* Benchmark the encoding for all the types

* Use a bigger delta

This is make the benchmark more accurate.

* Benchmark numeric decoding

* Add missing license headers

* Add missing safety docs

* Fix includes for tests

* Fix spelling

* Add a README

* Rename `reset` to `clear`

* Fail if static library cannot be linked

* Use a buffer wrapper

This allows us to not worry about growth allocations and correctly
clears the memory after the buffer was used.

* Use `iter_batched`

* Remove the main

We won't need this as the output of the C and Rust should be exactly the
same. So there is no memory usage to compare.

* Fix CPP include

Missed fix for new file coming from master.

* Fix comment

* Better neg infinity constant

* Better name for buffer wrapper

* Use small input for batch size

* Ignore bencher during coverage

* Use `with_capacity` instead

We should not want to grow the buffer for the inverted index tests and
benchmarks since that belongs in the buffer code. Instead we should
always make a buffer that is big enough for whatever is encoded into it.
This removes the `new` and add a `with_capacity` to make sure we always
use the test buffer in this way.
* run against vecsim main

* update vecsim tp point to main

* Update VectorSimilarity submodule

* bump vecsum to latest

* repalce image for amzn2 to gcc1 on arm

* update names

* fix tag

* remove devtoolset-10 from install script

* rename to meiravgrimberg937/gcc11:slim and install stuff

* install make

* update to gcc12-amzn2:slim

* install yum groupinstall "Development Tools" -y

* try old installato=ion without gcc

* install binutils
print version in task test

* fix

* move binutils installation to x86

* cd out

* update vecsim

* use ghcr.io/redisearch/gcc12-amzn2:slim

* use credentials only if necessary

* use nukk

* fix

* try

* revert

* change comptiable to 8.0

* change comptiable to 8.0

* revert unnecessary changes

* revert also .gutsubodule
* Updated deps/VectorSimilarity to main:fa8c459 (Jun'15)

* Support for SVS library integration

* Added support for SVS library VECSIM_ALGOS in pytests

Verified & Passed tests: test_sanity_cosine, test_sanity_l2, test_sanity_zero_results, test_ft_aggregate_basic, test_range_query_basic

* Added basic support for SVS_TIRED, LEANVEC

* Minor changes and review suggestions addressing
- Fixes for typos
- Update deps/vecsim to latest
- Changed naming for SVS (to SVS-VAMANA) and multiple SVS Params
- Removed block_size from svs params processing
- Check for appropriate date type for SVS-VAMANA
- Added a validation to keep NUM_THREADS within MAX_WORKER_THREADS
- Removed redundant VecSimAlgo_SVS blocks

* Added training_threshold parameter for SVS-VAMANA

* Fixes for review

- removed extra line
- removed redundant define
- fixed compresion values
- fixed svs self.type in tests for better code
- fixed typo in info_command

* Fixes for review & new tests added
- Updated test test_vecsim:test_create_errors for SVS-VAMANA
- Excluded SVS-VAMANA from test test_create_multi_value_json, test_default_block_size_and_initial_capacity
- Default parameters for SVS-VAMANA are setup explicitly
- Extra checks for TRAINING_THRESHOLD, related test
- Processing for absent quantBits value is added
- Fix for QueryError function call
- Removed MAX_CANDIDATE_POOL_SIZE and PRUNE_TO from info

* Naming fix: QuantBits -> Compression
Fix for TRAINING_THRESHOLD related error message
Added processing for VECSIM_LVQ_SCALAR compression case
Added skip for SVS-VAMANA in test_multithread
Updated test_sanity_cosine,test_sanity_l2 with case COMPRESSION,LVQ8 for SVS-VAMANA
Typo fix

* Added tests for SVS-VAMANA to test_create_errors
Typo fix

* Update deps to latest (f8bc695 Add Clang support to SVS)
* handle zero size in rust allocator shim instead of panic

* add tests
* Replace C implementation of varint with Rust implementation

This is the Rust re-implementation of the Varint* API. We'll use this
later to replace the C version.

* Improve documentation

* Address review comments

* Update src/redisearch_rs/c_entrypoint/varint/Cargo.toml

Co-authored-by: Pieter <[email protected]>

* Don't mention correctness in README.md

* Return raw pointer to C

---------

Co-authored-by: LukeMathWalker <[email protected]>
Co-authored-by: Pieter <[email protected]>
@codecov
Copy link

codecov bot commented Sep 19, 2025

Codecov Report

❌ Patch coverage is 92.49305% with 162 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (feature-RRF@780b597). Learn more about missing BASE report.

Files with missing lines Patch % Lines
src/inverted_index/inverted_index.c 82.67% 79 Missing ⚠️
src/index_result.c 74.00% 13 Missing ⚠️
src/ext/default.c 89.00% 11 Missing ⚠️
src/coord/rmr/conn.c 89.39% 7 Missing ⚠️
src/iterators/hybrid_reader.c 95.41% 6 Missing ⚠️
src/coord/rmr/cluster.c 89.58% 5 Missing ⚠️
src/coord/rmr/rq.c 70.58% 5 Missing ⚠️
src/byte_offsets.c 69.23% 4 Missing ⚠️
src/coord/config.c 80.00% 4 Missing ⚠️
src/debug_commands.c 92.72% 4 Missing ⚠️
... and 12 more
Additional details and impacted files
@@              Coverage Diff               @@
##             feature-RRF    #6862   +/-   ##
==============================================
  Coverage               ?   86.48%           
==============================================
  Files                  ?      301           
  Lines                  ?    48418           
  Branches               ?     9399           
==============================================
  Hits                   ?    41875           
  Misses                 ?     6392           
  Partials               ?      151           
Flag Coverage Δ
flow 84.60% <84.66%> (?)
unit 51.47% <60.47%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@oshadmi oshadmi requested a review from Copilot September 20, 2025 19:53
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR merges master into feature-RRF, re-aligning the feature branch with upstream iterator, indexing, and stats changes. Key areas touched include iterator rewrites (new revalidation logic, reduction optimizations, added profiling iterator), inverted index refactors (new reader abstraction, expanded GC delta API), metric and global stats time unit changes (clock ticks → nanoseconds), new vector algorithm (SVS), and API adjustments around yielding and memory reporting.

  • Major refactor of iterator APIs: introduction of Revalidate, reducer optimizations (union/intersection/optional/not), wildcard & inverted index reader redesign.
  • Inverted index restructuring: accessor/setter API, GC delta application, encoder/decoder wrappers, index summary utilities; removed legacy macros/functions.
  • Index/result API changed but dependent code still calls removed constructors and helpers (breaks build).

Reviewed Changes

Copilot reviewed 169 out of 511 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
src/legacy_types.h / .c Adds legacy type registration & RDB load “consume” handlers.
src/json.c Adds SVS vector algo handling and more explicit error messages; minor logic issue (duplicate break).
src/iterators/* (multiple) Refactors iterator construction, adds reducers, Revalidate paths, profile iterator, weight propagation; still using removed result APIs.
src/inverted_index/* Large redesign: reader abstraction, encoder/decoder exposure, GC delta APIs, accessor functions; updates stats & memory helpers.
src/index_result.h / .c Replaces prior aggregate/result API with slimmer metric & reset helpers; removes constructors & aggregate helpers (callers not updated).
src/info/* Adds SVS/Vamana stats, switches timing to wall-clock ns, adds oom policy, adjusts memory aggregation.
src/indexer.* Changes yield API signature & logic (params added), removes some prior yield call sites.
Comments suppressed due to low confidence (1)

src/iterators/hybrid_reader.c:1

  • NewMetricResult and NewHybridResult are removed from index_result.h; continued usage will cause unresolved symbol errors. Replace with the new creation mechanism for metric / hybrid aggregate results.
/*

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

chesedo and others added 4 commits September 21, 2025 05:50
* Provide a fake `isWithinRadius` for the benchmarks

* Provide a fake for `DocTable_Exists`

* Turn numeric FFI test off
…nitialization (#6817)

* initial commit

* prepare code for CI run

* move terms and fieldIdToIndex initialization to  initializeIndexSpec

* cleanup test a bit

* remove extra RWlock init (#6823)

* fix code review comments

* remove print from mock

* remove print from mock functions

* code review comment

* remove log remainder

* switch rm_calloc arg order

* add initializeFieldSpec function

---------

Co-authored-by: Joan Fontanals <[email protected]>
Fix build script logic for Intel SVS option handling
* MOD-11173: Fix unreachable links in markdown files

* Add link-check script and workflow for markdown files

* Use action upload-artifact@v4

* Use system-wide uv install

* Fix copilot review + add debug for missing file

* Fix failed local url, remove debug code

* Trigger link-check using a label "check-links"

* Fix URL encoding in README.md which fail on CI

* Code review fixes
@nafraf
Copy link
Collaborator Author

nafraf commented Sep 21, 2025

Copilot comments were resolved those comments were not related to changes included by feature-RRF branch, the files were identical to master branch.

@nafraf nafraf requested a review from oshadmi September 22, 2025 01:31
JonasKruckenberg and others added 5 commits September 22, 2025 12:27
Fix build script logic for Intel SVS option handling
* MOD-11173: Fix unreachable links in markdown files

* Add link-check script and workflow for markdown files

* Use action upload-artifact@v4

* Use system-wide uv install

* Fix copilot review + add debug for missing file

* Fix failed local url, remove debug code

* Trigger link-check using a label "check-links"

* Fix URL encoding in README.md which fail on CI

* Code review fixes
@nafraf nafraf requested a review from JoanFM September 22, 2025 14:09
Copy link
Collaborator

@oshadmi oshadmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🏆

@JoanFM
Copy link
Collaborator

JoanFM commented Sep 22, 2025

Let's move on with this, and focus the review effort when the RRF tries to merge to master, otherwise high risk of re-reviewing unrelated code

@nafraf nafraf merged commit 4a43d19 into feature-RRF Sep 22, 2025
19 checks passed
@nafraf nafraf deleted the nafraf_feature-RRF-merge-master branch September 22, 2025 14:45
github-merge-queue bot pushed a commit that referenced this pull request Sep 23, 2025
* MOD-10030: Add background depleter result processor (#6325)

Add RPDepleter

* Sync feature branch with master (#6373)

Sync feature branch with master

* Prevent CI breakage when Rust 1.88.0 is released (#6375) (#6389)

* MOD-10031: RPHybridMerger (#6317)

* MOD-10250: RRF RESP3 format response (#6388)

Add pipeline activation and result serialization functions for `FT.HYBRID`

* MOD-10299: Wrap Common AREQ Members Usage With Functions (#6390)

* wrap common areq members usage with functions

* fix compilation

* small fixes

* revert some small changes

* use AREQ_RequestFlags in a few other places

* additional changes

* minor changes

* Apply suggestions from code review

use right address

* code review comments

* MOD-10301: Add index locking/freeing mechanism for index-space consistency (#6413)

* Add index locking/freeing mechanism for index-space consistency

* Verify lock is released at the end of the depleter's logic

* Fix locking functions

* Set take-lock = false

* Address review and add tests with locking mechanism

* Add tests with index synchronization

* Move released var to sync

* Add important note

* Remove leftovers

* Address review

* MOD-10254: Separate Aggregation Pipeline From AREQ (#6399)

* wrap common areq members usage with functions

* fix compilation

* small fixes

* revert some small changes

* use AREQ_RequestFlags in a few other places

* additional changes

* minor changes

* initial commit

common initialization

use functions to access and manipulate common areq members

use local variables to reduce duplicate function calls

compilation fixes

revert line

remove hybrid request

* fix test

* add AggregationPipeline_Free

* move more code to AggregationPipeline_Free

* fix compilation + max search result

* code review comments

* fix compilation

* address code review comments, better pipeline and params separation for pipeline construction

* fix type name

* remove comment

* add cmake changes

* bring back old clean logic

* move AREQ_BuildPipeline to original position in header

* fix compilation post merge

* fix some tests seg faults

* remove hybrid request, not supposed to be in this branch

* add missing newlines

* bring back missing freeing of areq members

* minor api changes

* copy the context before releasing the loader

* add comments

* code review comments

* code review - rename

* accidently added hybrid request - removing

* [feature-RRF] MOD-10224: Add vector filter validation  (#6487)

Apply PR 6376 to feature-RRF branch

* MOD-10303: Hybrid Request (#6404)

* wrap common areq members usage with functions

* use AREQ_RequestFlags in a few other places

* common initialization

* use functions to access and manipulate common areq members

* use local variables to reduce duplicate function calls

* add AggregationPipeline_Free

* move more code to AggregationPipeline_Free

* fix compilation + max search result

* address code review comments, better pipeline and params separation for pipeline construction

* add cmake changes

* bring back old clean logic

* move AREQ_BuildPipeline to original position in header

* fix compilation post merge

* fix some tests seg faults

* remove hybrid request, not supposed to be in this branch

* add missing newlines

* bring back missing freeing of areq members

* minor api changes

* copy the context before releasing the loader

* add comments

* add hybrid request

* add loader to areq + tests

* add cmake changes

* align hybrid request api

* fix tests - use the AGGPLN inside the hybrid request for consistent linke list pointer comparison

* remove RLookup_CloneKey API

* remove pln member from params

* fix memory leak

* Moved loadDtor from aggregate_request to aggregate_plan and remove duplicate implementation of it

* Remove redundant VERBATIM flag setting in CreateTestAREQ function

* Add TODO comments for sorter

---------

Co-authored-by: [email protected] <[email protected]>

* MOD-10041 Parse FT.HYBRID command (#6428) (#6551)

* Initial FT.HYBRID parsing

* Fix parseSearchSubquery compilation error

* Fix search parsing and endless loop in error handling

* Improve cpp tests argv creation and fix

* Parse tail AGGPlan & rebase

* All tests pass

* Add database selection functions to redismock

* Fix parseCombine() memory leak

* Add HybridPipelineParams* to HybridRequest  struct

* Fix AggPlan handling in parser

* Update copyright notice and licensing information in parse_hybrid files

* Change AREQ pipeline to pointer

* Apply context to vector requests and refactoring assertions in tests

* Copy reqConfig to searchRequest and vectorRequest.

* Cleanup

* MOD-10303: Hybrid Request (#6404)

* wrap common areq members usage with functions

* use AREQ_RequestFlags in a few other places

* common initialization

* use functions to access and manipulate common areq members

* use local variables to reduce duplicate function calls

* add AggregationPipeline_Free

* move more code to AggregationPipeline_Free

* fix compilation + max search result

* address code review comments, better pipeline and params separation for pipeline construction

* add cmake changes

* bring back old clean logic

* move AREQ_BuildPipeline to original position in header

* fix compilation post merge

* fix some tests seg faults

* remove hybrid request, not supposed to be in this branch

* add missing newlines

* bring back missing freeing of areq members

* minor api changes

* copy the context before releasing the loader

* add comments

* add hybrid request

* add loader to areq + tests

* add cmake changes

* align hybrid request api

* fix tests - use the AGGPLN inside the hybrid request for consistent linke list pointer comparison

* remove RLookup_CloneKey API

* remove pln member from params

* fix memory leak

* Moved loadDtor from aggregate_request to aggregate_plan and remove duplicate implementation of it

* Remove redundant VERBATIM flag setting in CreateTestAREQ function

* Add TODO comments for sorter

---------



* CR: Add docs for a few functions

* CR: extract HYBRID_REQUEST_NUM_SUBQUERIES constant

* Fix pipeline reference in parseQueryArgs and AREQ_BuildPipeline functions

* Refactor to don't use AREQ to parse tail

* Refactor pipeline handling to use direct struct access instead of pointers

* Remove unused test util

* Revert "Remove unused test util"

This reverts commit 7130515.

* Revert "Revert "Remove unused test util""

This reverts commit d7e54d2.

* Remove multiline request test

This test only tested ArgvList initialization with multiline command

* Revert AREQ_AddRequestFlags() and AREQ_RemoveRequestFlags() changes

* Fix maxResultsLimit assignment logic

* Update HybridRequest_Free and tests to reflect scoring context is freed by hybrid merger

* Add test for valid input with parameters

* Implement context cleanup scheduling for background thread processing

---------

Co-authored-by: nafraf <[email protected]>
Co-authored-by: kei-nan <[email protected]>
Co-authored-by: [email protected] <[email protected]>

* MOD-10249 Hybrid implicit load (#6529)

* Implement automatic @__key and @__score return for hybrid search (MOD-10249)

- Add CreateImplicitLoadStep() function to create implicit LOAD with @__key and @__score fields
- Modify HybridRequest_BuildPipeline() to detect missing LOAD and add implicit one after merge operation
- Implicit LOAD is added only to tail pipeline for final merged results, not individual request pipelines
- Add proper RLookupKey creation following existing test helper patterns
- Add 2 focused tests following existing naming conventions:
  - testHybridRequestImplicitLoad: verifies implicit LOAD creation
  - testHybridRequestExplicitLoadPreserved: ensures explicit LOAD steps are preserved
- Update baseline test expectations to match correct behavior (no loaders in individual pipelines)
- All tests passing, implementation complete and ready for use

Implements PRD requirement 1.12: automatic doc_id and combined_score return when no LOAD field specified

* Fix hybrid search implicit LOAD field names

Change from @__key/@__score to doc_id/combined_score

* Remove empty lines

* Remove redundant test

* Fix hybrid search implicit LOAD field names

- Change from @__key/@__score to doc_id/combined_score
- Add HYBRID_IMPLICIT_DOC_ID and HYBRID_IMPLICIT_COMBINED_SCORE constants
- Update RLookupKey mapping and test comments
- Preserves both automatic and explicit LOAD behaviors

All tests passing.

* Fix implicit LOAD args setup for proper field name representation

- Add both doc_id and combined_score fields to implicit LOAD step
- Set up ArgsCursor with proper field names for serialization and display
- Use static array for args.objs to avoid memory management issues with loadDtor
- Include rmutil/args.h for ArgsCursor type definitions

All tests passing.

* Fix lookup to first

Last is after groupby steps, if exist. Currently it's not supported in hybrid subqueries so this was not noticed.

* Move implicit load from tail to hybrid subqueries

* Add implicit score if no LOAD was supplied

* Fix memory leak

* Rename key field

* CR: Remove comments and extend tests

* Add scoreKey validation for HybridMerger

* Fix merge

* MOD-10549 Hybrid implicit SORTBY (#6552)

* Implement automatic @__key and @__score return for hybrid search (MOD-10249)

- Add CreateImplicitLoadStep() function to create implicit LOAD with @__key and @__score fields
- Modify HybridRequest_BuildPipeline() to detect missing LOAD and add implicit one after merge operation
- Implicit LOAD is added only to tail pipeline for final merged results, not individual request pipelines
- Add proper RLookupKey creation following existing test helper patterns
- Add 2 focused tests following existing naming conventions:
  - testHybridRequestImplicitLoad: verifies implicit LOAD creation
  - testHybridRequestExplicitLoadPreserved: ensures explicit LOAD steps are preserved
- Update baseline test expectations to match correct behavior (no loaders in individual pipelines)
- All tests passing, implementation complete and ready for use

Implements PRD requirement 1.12: automatic doc_id and combined_score return when no LOAD field specified

* Fix hybrid search implicit LOAD field names

Change from @__key/@__score to doc_id/combined_score

* Remove empty lines

* Remove redundant test

* Fix hybrid search implicit LOAD field names

- Change from @__key/@__score to doc_id/combined_score
- Add HYBRID_IMPLICIT_DOC_ID and HYBRID_IMPLICIT_COMBINED_SCORE constants
- Update RLookupKey mapping and test comments
- Preserves both automatic and explicit LOAD behaviors

All tests passing.

* Fix implicit LOAD args setup for proper field name representation

- Add both doc_id and combined_score fields to implicit LOAD step
- Set up ArgsCursor with proper field names for serialization and display
- Use static array for args.objs to avoid memory management issues with loadDtor
- Include rmutil/args.h for ArgsCursor type definitions

All tests passing.

* Fix lookup to first

Last is after groupby steps, if exist. Currently it's not supported in hybrid subqueries so this was not noticed.

* Move implicit load from tail to hybrid subqueries

* Add implicit score if no LOAD was supplied

* Fix memory leak

* Rename key field

* CR: Remove comments and extend tests

* Add implicit sort-by-score for hybrid search and refactor test boilerplate

- Add implicit sort-by-score step in HybridRequest_BuildPipeline when no explicit SORTBY is provided
- Ensures hybrid search results are automatically sorted by computed hybrid scores (descending order)
- Only adds implicit sorting when AGPLN_FindStep finds no PLN_T_ARRANGE step in tail pipeline
- Uses maxResultsLimit from aggregation params, falls back to DEFAULT_LIMIT if not specified
- Follows same pattern as regular search pipelines for consistent behavior

Test improvements:
- Extract boilerplate setup/cleanup code into reusable helper methods in HybridRequestTest class
- Reduce test code size by 23% (743 → 568 lines) while maintaining full functionality
- Add automatic resource management with tracked cleanup in TearDown()
- Improve test readability by focusing on test logic rather than setup boilerplate
- Add comprehensive helper methods for hybrid request creation, scoring contexts, and pipeline verification
- Fix TODOs in existing tests to reflect new implicit sort-by-score functionality

* Add scoreKey validation for HybridMerger

* Fix merge

* Add SORTBY 0 to disable implicit sorting

* revert change

* Fix SORTBY 0 and add test

* CR: Removed unused code

* CR: Removed wrong comment

* CR: Preserve FT.AGGREGATE behavior with SORTBY 0

We introduce new IsHybrid request flag for that, and apply the new
behavior only if it's a hybrid request

* CR: Comments

* CR: Replace RP create with AGG Plan

* Fix tests

* MOD-10226: Implement parser for Hybrid Query VECTOR query params (#6482)

* MOD-10633 Add scorer and sorter by setting search flag for first subquery (#6583)

* Add scorer and sorter by setting search flag for first subquery

* Change search subquery type to Hybrid

* Fix missing parenthesis

Co-authored-by: Raz Monsonego <[email protected]>

* Fix missing parenthesis

* Clarify query type access

`reqflag & (QEXEC_F_IS_SEARCH | QEXEC_F_IS_HYBRID)` might be confusing because it suggests a query can have mixed type with the 2 flags set at the same time.
 Using `IsSearch(req) || IsHybrid(req)` to better convey that a request is either Hybrid or Search.

* Change type to Hybrid Search Subquery

Introducing a new separate type for Search Subquery of Hybrid.
Hybrid type is still left for the tail pipeline construction and output.

* Rename Hybrid query type to Hybrid Tail

* Fix bug

* Update src/aggregate/aggregate.h

Co-authored-by: nafraf <[email protected]>

* CR: Fix bug

* Fix flow test

---------

Co-authored-by: Raz Monsonego <[email protected]>
Co-authored-by: nafraf <[email protected]>

* MOD-10533: Sync RLookups - tail and upstreams  (#6612)

* MOD-10835 Fix SORTBY 0 by not creating Arrange AggPlanStep (#6635)

Fix SORTBY 0 by not creating Arrange AggPlanStep

* MOD-10550 MOD-10645 Window / Limit defaults (#6610)

* Add default Window/Limit handling

Also fix rrf k parsing to support decimal values.

* Remove unneeded comments

* Fix memory leaks in hybrid search: properly free HybridScoringContext

- Fix HybridRequest_Free to properly call HybridScoringContext_Free
- Remove manual HybridScoringContext_Free calls from all ParseHybridTest tests
- Establish proper ownership: HybridRequest owns and cleans up its scoring context
- Resolves use-after-free errors and memory leaks detected by AddressSanitizer
- All 52 ParseHybridTest and 13 HybridDefaultsTest now pass with ASAN enabled

* Update src/hybrid/parse_hybrid.c

Co-authored-by: Raz Monsonego <[email protected]>

* Remove accidentally committed files (TODO.md and CMakeLists.txt)

* Update src/hybrid/parse_hybrid.c

Co-authored-by: Raz Monsonego <[email protected]>

* CR: fix default RRF K, Add a test

* Add KNN K validation on Defaults tests

* Remove scope creeping test

I'm not sure whats the expected behavior on LIMIT 0 0 on Hybrid. counting results seems non obvious (should we count results that appear in both subqueries once or twice?) and missing the purpose of this command.
I'm removing this test and if this requires further handling lets make it on separate task to avoid scope creep

* Remove originalWeights

Also make num of weights parameter to parseCombine, preparing for multi vectors in Hybrid

* Move window back only to RRF

* fix CR comments

* CR cosmetics

- remove malloc failure check
- rename explicit load search for clarity

* Apply KNN K <= WINDOW constraint internally

* Refactor

* Fix Tests Refactor

* Fix Tests

* Another test refactor to extract boilerplate

* Trying to fix leaks

* Trying to fix leaks

* Move HybridScoringContext ownership from request to Hybrid Merger

* Release scoring ctx on parsing tests

Now that Merger RP owns scoring ctx and HybridRequest doesn't free it,
release it manually in parsing tests as the merger is not created there

* Remove redundant test cleanup

* Remove Merger dependency on parent RP

* SIZE_T_MAX not defined, use SIZE_MAX instead

---------

Co-authored-by: Raz Monsonego <[email protected]>

* MOD-10533: Merge SearchResults (#6579)

* MOD-10408: Hybrid command handler - part 1 (#6639)

* MOD-10227: Disable tag scoring for FT.HYBRID (#6707)

* MOD-11012: Fix VSIM filter not working in hybrid queries  (#6708)

* MOD-11132: Fix direct blob issue in FT.HYBRID (#6744)

* MOD-10407: Handle warnings and errors of the subquery pipelines (#6737)

* MOD-11140: Implement range vector subquery filter as intersect (#6751)

* MOD-10050: Add hybrid search flow tests - part 1 (#6711)

Add tests utils for FT.HYBRID flow tests and add tests for FT.HYBRID + KNN + RRF.

* MOD-11047: Rename RRF K parameter to CONSTANT (#6745)

* MOD-11144: Fix FT.HYBRID GROUPBY bug (#6776)

* MOD-10047: Prepare hybrid parsing for cursors (#6770)

* MOD-11049: Use BM25STD.NORM for text subquery for LINEAR fusion (#6789)

* Revert BM25STD.NORM as default (#6822)

* Revert "MOD-11049: Use BM25STD.NORM for text subquery for LINEAR fusion (#6789)"

This reverts commit 5d41d9e.

* Restore the explicit and implicit scorer tests for FT.HYBRID

* newline EOF

* Add flow test for FT.HYBRID with explicit scorer

* newline EOF

* MOD-11276, MOD-11043: Reconstruct RlookupRow when merging (#6801)

* Implement RLookup_AddKeysFrom and RLookupRow_TransferFields functions with corresponding tests

* Remove RLookupKey_Clone and RLookup_CloneInto and their respective tests

* Integrate hybrid_lookup_context to FT.HYBRID flow

* Remove leftovers and redundencies

* Refactor merge search results

* Refactor rlookup add&transfer fucntions,
Refactor their tests
Fix hybrid merger tests (include added parameter)

* enable YIELD_DISTANCE_AS cpp tests

* Add tests/pytests/test_hybrid_distance.py

* Fix memory leak

* Cleanups

* Fix test

* Enhance RLookup_AddKeysFrom to combine caller flags with source key properties, preserving non-transient flags.
Add tests for F_HIDDEN flag handling, verifying preservation and override behavior.

* CR comments

* CR comments

* CI fix

* CR comments

* Cleanup of lookup context in test_cpp_hybridmerger.cpp

* Fix memory leak

* CR comments

* spellcheck

* Refactor RLookup tests to use helper functions for keys initialization and values verification

* fix parsing of FT.HYBRID results in pytest

* MOD-10050: Tests with dialects and multithread (#6777)

* MOD-11048, MOD-11157: Add default normalizers to vector search (#6820)

* Implement RLookup_AddKeysFrom and RLookupRow_TransferFields functions with corresponding tests

* Remove RLookupKey_Clone and RLookup_CloneInto and their respective tests

* Integrate hybrid_lookup_context to FT.HYBRID flow

* Remove leftovers and redundencies

* Refactor merge search results

* Refactor rlookup add&transfer fucntions,
Refactor their tests
Fix hybrid merger tests (include added parameter)

* enable YIELD_DISTANCE_AS cpp tests

* Add tests/pytests/test_hybrid_distance.py

* Fix memory leak

* Cleanups

* Fix test

* Enhance RLookup_AddKeysFrom to combine caller flags with source key properties, preserving non-transient flags.
Add tests for F_HIDDEN flag handling, verifying preservation and override behavior.

* Implement RPVectorNormalizer and PLN_VectorNormalizerStep

* Intermediate work of vector normalization

* Add GetVecSimMetricFromVectorField function to extract metrics from vector field specifications

* Make default vector distance field hidden from response unless explicitly requested
Vector normalization step now reads distance from RLookup, normalizes it, and writes back
Always normalize vectors for both RFF and LINEAR scoring methods
Update test expectations to account for normalized L2 distances

* CR comments

* Implement RPVectorNormalizer and PLN_VectorNormalizerStep

* Intermediate work of vector normalization

* Add GetVecSimMetricFromVectorField function to extract metrics from vector field specifications

* Make default vector distance field hidden from response unless explicitly requested
Vector normalization step now reads distance from RLookup, normalizes it, and writes back
Always normalize vectors for both RFF and LINEAR scoring methods
Update test expectations to account for normalized L2 distances

* Add tests/pytests/test_hybrid_vector_normalizer.py
- test the yielded distance againt a calculated one

* Fix comments

* Fix tests to match new RLookups solution in FT.HYBRID

* Fix hybrid parsing tests as default scorefield is now enabled

* CR comments

* CR comments

* Move flag setting logic from aggregate_request to parse_hybrid

* Minor refactor of GetVecSimMetricFromVectorField

* new line EOF

* Add ['INT8', 'UINT8'] to test_hybrid_vector_normalizer

* CR comments

* CR comments

* Rename YIELD_DISTANCE_AS to YIELD_SCORE_AS
in FT.HYBRID VSIM

* Fix test_hybrid_multithread

* CR comnents

* Fix test_hybrid_dialects

* Minor fix

* MOD-11258: Add parameter count validation to COMBINE LINEAR clause (#6826)

* MOD-10047: Add Shard Cursor Support For _FT.HYBRID (#6697)

* hybrid request and hybrid parsing changes

* important fixes to tests and code

* revert unit-tests file changes

* fix memory leaks in unit tests

* fix memory leaks in flow tests

* code review comments

* initial commit

* small fix

* implemented basic functionality

* bring back commented out code, remove isHybrid

* post rebase on feature-RRF

* fix compilation post rebase

* trying to reduce amount of changes

* fix compilation post rebase

* some general fixes

* try and fix cpp-tests paths issue

* fix the hybrid tests, bring back support for running cpp-tests with gdb

* rename test file

* fixing the new cursor flow

* added flow tests, some minor fixes

* rebase fixes

* remove code which went in accidently

* revert again unit tests changes

* don't run internal command in cluster environment

* try and fix memory leak

* increase coverage for when cursor limit is reached, fix request lifetime management.

* fix typo

* remove free in execute since now lifetime is managed by ref counting.

* fix memory leak

* fix leaks in error flow

* fix StartCursors signature

* fix query error leaks

* compilation fixes post rebase

* remove some redundant changes

* remove some more redundant changes

* code review comments

* comments + result limit issue workaround

* minor fix

* code review comments

* try and fix tests

* move query test to be a cpp test since it needs redis mock, fix query error clear function

* code review comments, fix leak in query test, add flow test

* fix typo

* small fix

* fix if

* don't rely on done_depleting since it is set by depleter thread

* hybrid debug request should now also free the hybrid request

* fix code review comment

* MOD-11335: Sync master and feature-RRF branch (#6862)

* Fix after code review

* Update HasScorer()

* Revert reply.c changes

* Fix serializeVectorNormalizer()

---------

Co-authored-by: Raz Monsonego <[email protected]>
Co-authored-by: Itzikvaknin <[email protected]>
Co-authored-by: kei-nan <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: ofiryanai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.