[8.2] MOD-12234: Fix totalDocsLen updates (#7227) #7264

Merged: raz-mon merged 1 commit into 8.2 from razmon-cp_8.2_totalDocsLen on Nov 10, 2025

raz-mon (Collaborator) commented Nov 9, 2025

Cherry-pick of #7227 to 8.2.


Note

Refactors document deletion to use DocTable_Pop and updates numDocuments/totalDocsLen reliably (with assertions), removes obsolete old-metadata tracking, and adjusts tests/expected scores accordingly.

  • Core/Deletion & Stats:
    • Replace DocTable_Delete with DocTable_Pop across codepaths (indexer.c, redisearch_api.c, spec.c); clarify ref handling with DMD_Return.
    • On replace/delete, decrement numDocuments and subtract totalDocsLen with sanity assertions to prevent underflow.
    • Ensure vector/geometry index cleanup on deletes.
  • Add/Index Context:
    • Remove oldMd from RSAddDocumentCtx and related bookkeeping.
  • DocTable:
    • Adjust the pop path to unchain the entry, update memory accounting and sortables size, and update the comment about the returned (borrowed) reference.
  • Tests:
    • Update C++ test to use DocTable_Pop and explicit DMD_Return.
    • Add BM25 average doc length regression test; tweak BM25/BM25STD expected scores; keep TFIDF.DOCNORM unchanged.

Written by Cursor Bugbot for commit 078be64.
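
The counter bookkeeping described in the note can be illustrated with a minimal sketch. It is not the code from this PR: `StatsSketch` and the `stats_on_*` helpers are hypothetical names, and only the field names `numDocuments` and `totalDocsLen` come from the description above. The sketch assumes both counters are unsigned, which is why each subtraction is guarded by an assertion; an unguarded subtraction would wrap around to a very large value instead of going negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the index statistics; not RediSearch's real struct. */
typedef struct {
  size_t numDocuments; /* number of live documents */
  size_t totalDocsLen; /* sum of the lengths of all live documents */
} StatsSketch;

/* Account for a newly indexed document of length docLen. */
void stats_on_add(StatsSketch *s, size_t docLen) {
  s->numDocuments += 1;
  s->totalDocsLen += docLen;
}

/* Account for a removed document. The assertions fire before an unsigned
 * subtraction could wrap around. */
void stats_on_remove(StatsSketch *s, size_t docLen) {
  assert(s->numDocuments > 0);
  assert(s->totalDocsLen >= docLen);
  s->numDocuments -= 1;
  s->totalDocsLen -= docLen;
}

/* A replace is modeled as removing the old version and adding the new one,
 * so both counters stay consistent with the set of live documents. */
void stats_on_replace(StatsSketch *s, size_t oldLen, size_t newLen) {
  stats_on_remove(s, oldLen);
  stats_on_add(s, newLen);
}
```

In this model, a replace leaves `numDocuments` unchanged and shifts `totalDocsLen` by the difference between the new and old lengths, which is the invariant the assertions are meant to protect.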

* Fix BM25STD underflow wraparound

* Fix comment

* Fix documentation and tweak test

* Address review

* Remove old test

* Skip cluster on expiration test

* Fix leak in test

* Switch to rust ii (#6958)

* Switch to Rust inverted index

* Remove encoder and decoder getter usage

It is no longer possible to get the encoders or decoders from the flags.
Instead they are coupled to the inverted index or its reader.

* Update `testNumericInverted`

* Update testIndexFlags

* Don't check buffer growth in test

There is no reason for the test to check the internals of these structures.

* Fix size test on flags

* Fix size test for numeric index

* Fix size test for tag index

* Fix sizes on LLapi

* Update fork tests to const

* C GC updates

* Update fork tests

* Don't free deltas twice

* Fix sizes in Python tests

* Recreate index to have compression turned on

* Remove impossible test

This test requires a mutable access to the inverted index in a reader
and direct access to the repair block call. Both of these are not
possible so the test is removed.

* Remove C benchmarks

* Move header out of future

* Remove from iterator benches

* Fix symbols for GC benchmark

* Update iterator benchmark FFI

* Get num_entries on index directly

* Trim bench dependencies

* Fix filters in CPP tests

The text inverted indices expect to have a field mask filter, not the
`None` case.

* Update debug assertions

It is now possible to have an inverted index with no blocks in it.

* Partially fix totalDocLen metric

* Fix

* Make test not pass by accident

* Remove var

* Update totalDocLen on deletions as well

* Fix tests

* Fix assertions

* Clean up unused old dmd

* Add assertions to llapi

* Remove Delete API

* Skip unnecessary cluster tests

---------

Co-authored-by: Pieter <[email protected]>
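
To see why a stale `totalDocsLen` shifts BM25 scores, here is a small sketch using the textbook BM25 term weight. It is an illustration under stated assumptions, not RediSearch's scorer: `avg_doc_len` and `bm25_term_score` are hypothetical helpers, the average document length is assumed to be `totalDocsLen / numDocuments`, and `k1` and `b` are the usual BM25 tuning parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed derivation of the average document length from the two counters. */
double avg_doc_len(size_t totalDocsLen, size_t numDocuments) {
  return numDocuments == 0 ? 0.0 : (double)totalDocsLen / (double)numDocuments;
}

/* Textbook BM25 weight of a single term in a single document:
 *   idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * docLen / avgDocLen)) */
double bm25_term_score(double tf, double idf, double docLen, double avgDocLen,
                       double k1, double b) {
  assert(k1 >= 0.0 && b >= 0.0 && b <= 1.0);
  /* Without a usable average, fall back to no length normalization. */
  double norm = (avgDocLen > 0.0) ? (1.0 - b + b * (docLen / avgDocLen)) : 1.0;
  return idf * (tf * (k1 + 1.0)) / (tf + k1 * norm);
}
```

If deleted documents are never subtracted from the total, the computed average is too large, every remaining document looks comparatively short, and its score rises. That is consistent with the expected BM25 and BM25STD scores in the tests changing once the counters are maintained correctly.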
@raz-mon raz-mon requested review from alonre24 and oshadmi November 9, 2025 16:26
@github-actions github-actions bot added the size:M label Nov 9, 2025
@raz-mon raz-mon enabled auto-merge November 9, 2025 16:29
codecov bot commented Nov 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.37%. Comparing base (096ea0f) to head (078be64).
⚠️ Report is 2 commits behind head on 8.2.

Additional details and impacted files
@@            Coverage Diff             @@
##              8.2    #7264      +/-   ##
==========================================
- Coverage   89.41%   89.37%   -0.04%     
==========================================
  Files         253      253              
  Lines       40801    40800       -1     
  Branches     3725     3725              
==========================================
- Hits        36482    36467      -15     
- Misses       4270     4284      +14     
  Partials       49       49              
Flag Coverage Δ
flow 82.03% <64.70%> (-0.17%) ⬇️
unit 47.33% <64.70%> (-0.01%) ⬇️

@raz-mon raz-mon added this pull request to the merge queue Nov 9, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 9, 2025
@raz-mon raz-mon added this pull request to the merge queue Nov 10, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 10, 2025
@raz-mon raz-mon added this pull request to the merge queue Nov 10, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 10, 2025
@raz-mon raz-mon added this pull request to the merge queue Nov 10, 2025
Merged via the queue into 8.2 with commit a096a23 Nov 10, 2025
19 checks passed
@raz-mon raz-mon deleted the razmon-cp_8.2_totalDocsLen branch November 10, 2025 13:43
alonre24 added a commit to alonre24/redis that referenced this pull request Jan 26, 2026
**Bug Fixes:**

* [redis#7219](RediSearch/RediSearch#7219) Fix a concurrency issue on Reducer in `FT.AGGREGATE`
* [redis#7255](RediSearch/RediSearch#7255) Fix `BM25STD` underflow wraparound
* [redis#7275](RediSearch/RediSearch#7275) Report used memory as `unsigned long long` to avoid underflows
* [redis#7264](RediSearch/RediSearch#7264) Fix `totalDocsLen` updates
* [redis#6995](RediSearch/RediSearch#6995) Do not fanout `FT.INFO` to replicas
* [redis#7350](RediSearch/RediSearch#7350) Fix `FT.CREATE` failure with LeanVec parameters on non-Intel architectures
* [redis#7384](RediSearch/RediSearch#7384) Fix index load from RDB temporary memory overhead
* [redis#7459](RediSearch/RediSearch#7459) Fix Fork GC potential double-free on error path
* [redis#7458](RediSearch/RediSearch#7458) Fix a GC performance regression
* [redis#7470](RediSearch/RediSearch#7470) Avoid draining worker thread pool from FLUSH callback to avoid deadlocks
* [redis#7554](RediSearch/RediSearch#7554) Handle the case where `SCORE` is sent alone without extra fields (coordinator)
* [redis#7685](RediSearch/RediSearch#7685) Fix cursor logical leak
* [redis#7794](RediSearch/RediSearch#7794) Fix `cmp_strings()` to correctly handle binary data with embedded NULLs in TOLIST reducer in FT.AGGREGATE
* [redis#7873](RediSearch/RediSearch#7873) Handle warnings in empty `FT.AGGREGATE` replies (cluster)
* [redis#7886](RediSearch/RediSearch#7886) Remove non-TEXT fields from the spec keys dictionary
* [redis#7904](RediSearch/RediSearch#7904) Refactor keys dictionary handling
* [redis#7901](RediSearch/RediSearch#7901) Support multiple warnings in reply
* [redis#8083](RediSearch/RediSearch#8083) Fix incorrect FULLTEXT field metric counts
* [redis#8153](RediSearch/RediSearch#8153) Fix configuration registration issues

**Improvements:**

* [redis#7154](RediSearch/RediSearch#7154) `FT.AGGREGATE` can return Background Indexing OOM warnings
* [redis#7083](RediSearch/RediSearch#7083) Add the default text scorer as a configuration option
* [redis#7341](RediSearch/RediSearch#7341) Rename `FT.PROFILE` counter fields
* [redis#7436](RediSearch/RediSearch#7436) Enhance `FT.PROFILE` with vector search execution details
* [redis#7435](RediSearch/RediSearch#7435) Ensure full `FT.PROFILE` output on timeout with RETURN policy
* [redis#7534](RediSearch/RediSearch#7534) Reduce the number of worker threads asynchronously to avoid deadlocks during queries
* [redis#7614](RediSearch/RediSearch#7614) Track timeout warnings and errors in INFO
* [redis#7646](RediSearch/RediSearch#7646) Track `maxprefixexpansions` warnings and errors in INFO
* [redis#7577](RediSearch/RediSearch#7577) Track query syntax/argument errors (basis for query error tracking)
* [redis#7737](RediSearch/RediSearch#7737) Add `Internal cursor reads` metric to cluster `FT.PROFILE` output
* [redis#7759](RediSearch/RediSearch#7759) Extend indexing metrics
* [redis#7710](RediSearch/RediSearch#7710) Support `WITHCOUNT` keyword in `FT.AGGREGATE`
* [redis#7957](RediSearch/RediSearch#7957) Persist query warnings across cursor reads
* [redis#8054](RediSearch/RediSearch#8054) Add logging for index-related commands
* [redis#8151](RediSearch/RediSearch#8151) Fix shard total profile time reporting in `FT.PROFILE`
* [redis#8103](RediSearch/RediSearch#8103) Output current thread IndexSpec information on crash
YaacovHazan pushed a commit to redis/redis that referenced this pull request Jan 26, 2026
**Bug Fixes:**

* [#7219](RediSearch/RediSearch#7219) Fix a
concurrency issue on Reducer in `FT.AGGREGATE`
* [#7255](RediSearch/RediSearch#7255) Fix
`BM25STD` underflow wraparound
* [#7275](RediSearch/RediSearch#7275) Report
used memory as `unsigned long long` to avoid underflows
* [#7264](RediSearch/RediSearch#7264) Fix
`totalDocsLen` updates
* [#6995](RediSearch/RediSearch#6995) Do not
fanout `FT.INFO` to replicas
* [#7350](RediSearch/RediSearch#7350) Fix
`FT.CREATE` failure with LeanVec parameters on non-Intel architectures
* [#7694](RediSearch/RediSearch#7694) Use
asynchronous jobs for GC in SVS to accelerate execution
* [#7384](RediSearch/RediSearch#7384) Fix index
load from RDB temporary memory overhead
* [#7459](RediSearch/RediSearch#7459) Fix Fork
GC potential double-free on error path
* [#7458](RediSearch/RediSearch#7458) Fix a GC
performance regression
* [#7470](RediSearch/RediSearch#7470) Avoid
draining worker thread pool from FLUSH callback to avoid deadlocks
* [#7554](RediSearch/RediSearch#7554) Handle the
case where `SCORE` is sent alone without extra fields (coordinator)
* [#7685](RediSearch/RediSearch#7685) Fix cursor
logical leak
* [#7794](RediSearch/RediSearch#7794) Fix
`cmp_strings()` to correctly handle binary data with embedded NULLs in
TOLIST reducer in FT.AGGREGATE
* [#7873](RediSearch/RediSearch#7873) Handle
warnings in empty `FT.AGGREGATE` replies (cluster)
* [#7886](RediSearch/RediSearch#7886) Remove
non-TEXT fields from the spec keys dictionary
* [#7904](RediSearch/RediSearch#7904) Refactor
keys dictionary handling
* [#7901](RediSearch/RediSearch#7901) Support
multiple warnings in reply
* [#8083](RediSearch/RediSearch#8083) Fix
incorrect FULLTEXT field metric counts
* [#8153](RediSearch/RediSearch#8153) Fix
configuration registration issues

**Improvements:**

* [#7154](RediSearch/RediSearch#7154)
`FT.AGGREGATE` can return Background Indexing OOM warnings
* [#7083](RediSearch/RediSearch#7083) Add the
default text scorer as a configuration option
* [#7341](RediSearch/RediSearch#7341) Rename
`FT.PROFILE` counter fields
* [#7436](RediSearch/RediSearch#7436) Enhance
`FT.PROFILE` with vector search execution details
* [#7435](RediSearch/RediSearch#7435) Ensure
full `FT.PROFILE` output on timeout with RETURN policy
* [#7534](RediSearch/RediSearch#7534) Reduce the
number of worker threads asynchronously to avoid deadlocks during
queries
* [#7614](RediSearch/RediSearch#7614) Track
timeout warnings and errors in INFO
* [#7646](RediSearch/RediSearch#7646) Track
`maxprefixexpansions` warnings and errors in INFO
* [#7577](RediSearch/RediSearch#7577) Track
query syntax/argument errors (basis for query error tracking)
* [#7737](RediSearch/RediSearch#7737) Add
`Internal cursor reads` metric to cluster `FT.PROFILE` output
* [#7759](RediSearch/RediSearch#7759) Extend
indexing metrics
* [#7710](RediSearch/RediSearch#7710) Support
`WITHCOUNT` keyword in `FT.AGGREGATE`
* [#7957](RediSearch/RediSearch#7957) Persist
query warnings across cursor reads
* [#8054](RediSearch/RediSearch#8054) Add
logging for index-related commands
* [#8151](RediSearch/RediSearch#8151) Fix shard
total profile time reporting in `FT.PROFILE`
* [#8103](RediSearch/RediSearch#8103) Output
current thread IndexSpec information on crash
3 participants