Skip to content

[8.2] MOD-12984 Persist query warnings across cursor reads#7957

Merged
meiravgri merged 2 commits into8.2from
backport-7865-to-8.2
Jan 1, 2026
Merged

[8.2] MOD-12984 Persist query warnings across cursor reads#7957
meiravgri merged 2 commits into8.2from
backport-7865-to-8.2

Conversation

@meiravgri
Copy link
Collaborator

@meiravgri meiravgri commented Dec 31, 2025

backport #7865 to 8.2


Note

Improves reliability of profiling output and warning persistence across cursor reads and empty-result paths.

  • Introduces ProfilePrinterCtx on AREQ to persist a warnings bitset and cursor_reads; removes ad‑hoc timeout/cursor state
  • Collects and stores warnings (TIMEOUT, MAX_PREFIX_EXPANSIONS, BG_SCAN_OOM) during execution in both RESP2/3, including empty-result replies; increments cursor_reads from cursor runner
  • Changes profile callbacks to accept AREQ* and read from AREQ_ProfilePrinterCtx(req); coordinator now drains shard replies and prints profile via Profile_PrintInFormat
  • Coordinator RESP3 parsing now scans all warnings for timeouts; ensures CURSOR PROFILE/DEL flow after timeouts
  • FT.PROFILE output shows aggregated warnings and internal cursor read count when applicable
  • Adds tests to validate BG scan OOM warning presence, warning persistence on timeout/empty replies, RESP2/RESP3 behavior, and max prefix expansions warning across multiple cursor reads

Written by Cursor Bugbot for commit 3060f78. This will update automatically on new commits. Configure here.

* bit map place holder

* save profilectx in AREQ

introduce ProfileWarnings

* return all warnings in coord reply

return all warnings in sendchunk resp3

return all warnings as an array in profil;e

* align tests to the new format

 add testProfileMultipleWarnings and testPersistProfileWarning_MaxPrefixExpansions

 and test_multiple_warnings

* more fixes

* test fixes

* revert multiple warnings code relted changes

* fixes

* remove tests relarted to #7892

* imp testPersistProfileWarning_MaxPrefixExpansions

* remove temporary test

* fix mising BG indexing OOM

* fix test

* move to header

* add oom test

* asserts

* detect timeout when there are multiple warnings in netCursorCallback

* fix assert

* enhance test_multiple_warnings to test we have exactly 1 cursor read

* fix assert

* test with oom (remove in the next commit)

* fix test - only test with timeout

* comment on ProfileWarningType

* fix test

* remove unused

(cherry picked from commit 2f47432)
@meiravgri meiravgri changed the title Backport-7865-to-8.2 [8.2] MOD-12984 Persist query warnings across cursor reads Dec 31, 2025
@meiravgri meiravgri changed the base branch from master to 8.2 December 31, 2025 17:33
@meiravgri meiravgri requested a review from lerman25 December 31, 2025 17:35
Copy link

@jit-ci jit-ci bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Jit has detected 13 important findings in this PR that you should review.


Jit encountered an internal error and cannot comment on each finding.

You can ask a Jit admin to comment #jit_ignore_all on this PR to ignore the findings.

Here are the findings in this PR:

  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/task-test.yml
    • Lines: 399-402
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/task-test.yml
    • Lines: 138-159
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/task-test.yml
    • Lines: 476-476
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/benchmark-flow.yml
    • Lines: 60-60
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/task-test.yml
    • Lines: 461-461
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/task-get-config.yml
    • Lines: 55-287
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/flow-micro-benchmarks.yml
    • Lines: 30-62
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/task-test.yml
    • Lines: 408-413
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/task-build-artifacts.yml
    • Lines: 257-259
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/generate-basic-matrix.yml
    • Lines: 39-74
  • Security Control: Static Code Analysis Semgrep Pro
    • Type: Yaml.Github-Actions.Security.Run-Shell-Injection.Run-Shell-Injection
    • Description: Using variable interpolation ${{...}} with github context data in a run: step could allow an attacker to inject their own code into the runner. This would allow them to steal secrets and code. github context data can have arbitrary user input and should be treated as untrusted. Instead, use an intermediate environment variable with env: to store the data and use the environment variable in the run: script. Be sure to use double-quotes the environment variable, like this: "$ENVVAR".
    • Severity: HIGH
    • Learn More: Link
    • Filename: .github/workflows/generate-matrix.yml
    • Lines: 43-117
  • Security Control: Docker Scan
    • Type: Image User Should Not Be 'Root'
    • Description: Running containers with 'root' user can lead to a container escape situation. It is a best practice to run containers as non-root users, which can be done by adding a 'USER' statement to the Dockerfile.
    • Severity: HIGH
    • Learn More: Link
    • Filename: Dockerfile
    • Lines: 1-1
  • Security Control: Docker Scan
    • Type: 'Apt-Get' Missing '--No-Install-Recommends'
    • Description: 'apt-get' install should use '--no-install-recommends' to minimize image size.
    • Severity: HIGH
    • Learn More: Link
    • Filename: Dockerfile
    • Lines: 15-19

@codecov
Copy link

codecov bot commented Dec 31, 2025

Codecov Report

❌ Patch coverage is 97.77778% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 89.58%. Comparing base (cd61cb5) to head (3060f78).
⚠️ Report is 2 commits behind head on 8.2.

Files with missing lines Patch % Lines
src/aggregate/aggregate_exec.c 95.83% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              8.2    #7957      +/-   ##
==========================================
- Coverage   89.63%   89.58%   -0.05%     
==========================================
  Files         256      257       +1     
  Lines       41599    41606       +7     
  Branches     3719     3719              
==========================================
- Hits        37286    37273      -13     
- Misses       4264     4284      +20     
  Partials       49       49              
Flag Coverage Δ
flow 82.35% <97.77%> (-0.86%) ⬇️
unit 47.19% <0.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@meiravgri meiravgri enabled auto-merge January 1, 2026 08:38
@meiravgri meiravgri added this pull request to the merge queue Jan 1, 2026
Merged via the queue into 8.2 with commit 3502cec Jan 1, 2026
32 checks passed
@meiravgri meiravgri deleted the backport-7865-to-8.2 branch January 1, 2026 12:11
alonre24 added a commit to alonre24/redis that referenced this pull request Jan 26, 2026
**Bug Fixes:**

* [redis#7219](RediSearch/RediSearch#7219) Fix a concurrency issue on Reducer in `FT.AGGREGATE`
* [redis#7255](RediSearch/RediSearch#7255) Fix `BM25STD` underflow wraparound
* [redis#7275](RediSearch/RediSearch#7275) Report used memory as `unsigned long long` to avoid underflows
* [redis#7264](RediSearch/RediSearch#7264) Fix `totalDocsLen` updates
* [redis#6995](RediSearch/RediSearch#6995) Do not fanout `FT.INFO` to replicas
* [redis#7350](RediSearch/RediSearch#7350) Fix `FT.CREATE` failure with LeanVec parameters on non-Intel architectures
* [redis#7384](RediSearch/RediSearch#7384) Fix index load from RDB temporary memory overhead
* [redis#7459](RediSearch/RediSearch#7459) Fix Fork GC potential double-free on error path
* [redis#7458](RediSearch/RediSearch#7458) Fix a GC performence regression
* [redis#7470](RediSearch/RediSearch#7470) Avoid draining worker thread pool from FLUSH callback to avoid deadlocks
* [redis#7554](RediSearch/RediSearch#7554) Handle the case where `SCORE` is sent alone without extra fields (coordinator)
* [redis#7685](RediSearch/RediSearch#7685) Fix cursor logical leak
* [redis#7794](RediSearch/RediSearch#7794) Fix `cmp_strings()` to correctly handle binary data with embedded NULLs in TOLIST reducer in FT.AGGREGATE
* [redis#7873](RediSearch/RediSearch#7873) Handle warnings in empty `FT.AGGREGATE` replies (cluster)
* [redis#7886](RediSearch/RediSearch#7886) Remove non-TEXT fields from the spec keys dictionary
* [redis#7904](RediSearch/RediSearch#7904) Refactor keys dictionary handling
* [redis#7901](RediSearch/RediSearch#7901) Support multiple warnings in reply
* [redis#8083](RediSearch/RediSearch#8083) Fix incorrect FULLTEXT field metric counts
* [redis#8153](RediSearch/RediSearch#8153) Fix configuration registration issues

**Improvements:**

* [redis#7154](RediSearch/RediSearch#7154) `FT.AGGREGATE` can return Background Indexing OOM warnings
* [redis#7083](RediSearch/RediSearch#7083) Add the default text scorer as a configuration option
* [redis#7341](RediSearch/RediSearch#7341) Rename `FT.PROFILE` counter fields
* [redis#7436](RediSearch/RediSearch#7436) Enhance `FT.PROFILE` with vector search execution details
* [redis#7435](RediSearch/RediSearch#7435) Ensure full `FT.PROFILE` output on timeout with RETURN policy
* [redis#7534](RediSearch/RediSearch#7534) Reduce the number of worker threads asynchronously to avoid deadlocks during queries
* [redis#7614](RediSearch/RediSearch#7614) Track timeout warnings and errors in INFO
* [redis#7646](RediSearch/RediSearch#7646) Track `maxprefixexpansions` warnings and errors in INFO
* [redis#7577](RediSearch/RediSearch#7577) Track query syntax/argument errors (basis for query error tracking)
* [redis#7737](RediSearch/RediSearch#7737) Add `Internal cursor reads` metric to cluster `FT.PROFILE` output
* [redis#7759](RediSearch/RediSearch#7759) Extend indexing metrics
* [redis#7710](RediSearch/RediSearch#7710) Support `WITHCOUNT` keyword in `FT.AGGREGATE`
* [redis#7957](RediSearch/RediSearch#7957) Persist query warnings across cursor reads
* [redis#8054](RediSearch/RediSearch#8054) Add logging for index-related commands
* [redis#8151](RediSearch/RediSearch#8151) Fix shard total profile time reporting in `FT.PROFILE`
* [redis#8103](RediSearch/RediSearch#8103) Output current thread IndexSpec information on crash
YaacovHazan pushed a commit to redis/redis that referenced this pull request Jan 26, 2026
**Bug Fixes:**

* [#7219](RediSearch/RediSearch#7219) Fix a
concurrency issue on Reducer in `FT.AGGREGATE`
* [#7255](RediSearch/RediSearch#7255) Fix
`BM25STD` underflow wraparound
* [#7275](RediSearch/RediSearch#7275) Report
used memory as `unsigned long long` to avoid underflows
* [#7264](RediSearch/RediSearch#7264) Fix
`totalDocsLen` updates
* [#6995](RediSearch/RediSearch#6995) Do not
fanout `FT.INFO` to replicas
* [#7350](RediSearch/RediSearch#7350) Fix
`FT.CREATE` failure with LeanVec parameters on non-Intel architectures
* [#7694](RediSearch/RediSearch#7694) Use
asynchronous jobs for GC in SVS to accelerate execution
* [#7384](RediSearch/RediSearch#7384) Fix index
load from RDB temporary memory overhead
* [#7459](RediSearch/RediSearch#7459) Fix Fork
GC potential double-free on error path
* [#7458](RediSearch/RediSearch#7458) Fix a GC
performence regression
* [#7470](RediSearch/RediSearch#7470) Avoid
draining worker thread pool from FLUSH callback to avoid deadlocks
* [#7554](RediSearch/RediSearch#7554) Handle the
case where `SCORE` is sent alone without extra fields (coordinator)
* [#7685](RediSearch/RediSearch#7685) Fix cursor
logical leak
* [#7794](RediSearch/RediSearch#7794) Fix
`cmp_strings()` to correctly handle binary data with embedded NULLs in
TOLIST reducer in FT.AGGREGATE
* [#7873](RediSearch/RediSearch#7873) Handle
warnings in empty `FT.AGGREGATE` replies (cluster)
* [#7886](RediSearch/RediSearch#7886) Remove
non-TEXT fields from the spec keys dictionary
* [#7904](RediSearch/RediSearch#7904) Refactor
keys dictionary handling
* [#7901](RediSearch/RediSearch#7901) Support
multiple warnings in reply
* [#8083](RediSearch/RediSearch#8083) Fix
incorrect FULLTEXT field metric counts
* [#8153](RediSearch/RediSearch#8153) Fix
configuration registration issues

**Improvements:**

* [#7154](RediSearch/RediSearch#7154)
`FT.AGGREGATE` can return Background Indexing OOM warnings
* [#7083](RediSearch/RediSearch#7083) Add the
default text scorer as a configuration option
* [#7341](RediSearch/RediSearch#7341) Rename
`FT.PROFILE` counter fields
* [#7436](RediSearch/RediSearch#7436) Enhance
`FT.PROFILE` with vector search execution details
* [#7435](RediSearch/RediSearch#7435) Ensure
full `FT.PROFILE` output on timeout with RETURN policy
* [#7534](RediSearch/RediSearch#7534) Reduce the
number of worker threads asynchronously to avoid deadlocks during
queries
* [#7614](RediSearch/RediSearch#7614) Track
timeout warnings and errors in INFO
* [#7646](RediSearch/RediSearch#7646) Track
`maxprefixexpansions` warnings and errors in INFO
* [#7577](RediSearch/RediSearch#7577) Track
query syntax/argument errors (basis for query error tracking)
* [#7737](RediSearch/RediSearch#7737) Add
`Internal cursor reads` metric to cluster `FT.PROFILE` output
* [#7759](RediSearch/RediSearch#7759) Extend
indexing metrics
* [#7710](RediSearch/RediSearch#7710) Support
`WITHCOUNT` keyword in `FT.AGGREGATE`
* [#7957](RediSearch/RediSearch#7957) Persist
query warnings across cursor reads
* [#8054](RediSearch/RediSearch#8054) Add
logging for index-related commands
* [#8151](RediSearch/RediSearch#8151) Fix shard
total profile time reporting in `FT.PROFILE`
* [#8103](RediSearch/RediSearch#8103) Output
current thread IndexSpec information on crash
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants