[8.4] [MOD-12418] Track OOM errors and warnings in info (#7452) #7576
* Track timeout in sendchunk resp2
* Track timeout warning in sendSearchResults
* Track timeout error in searchResultReducer (cherry picked from commit fafe1dc)
* Track timeout in sendChunk_hybrid (cherry picked from commit 8529cb9)
* test timeout metrics (cherry picked from commit c56a0d9)
* fix isCoord check
* Add query warning code and add function and fields needed to track (cherry picked from commit a414641)
* Track timeout in sendchunk resp3 (cherry picked from commit 6853bb9)
* readd skip
* Update syntax and args error to new SA as cluster
* format and enrico comment
* Track OOM (cherry picked from commit de1a285aac27c73d4feca50abe3c2328f6959ce2)
* fix warnings double counting
* fix missing skip and logic
* Change test to N=0 with Internal only (not working so revert afterwards)
* Revert "Change test to N=0 with Internal only (not working so revert afterwards)" This reverts commit 829ac53.
* meirav comments
* Stablize tests
* Add resp3 test
* _disable_ hybrid sa timeout
* Make test robust
* fixup! Make test robust
* remove limits
* comments
* Refactor warning tracking loop for clarity
* Add test for warnings metric count with timeout
* fix flaky (cherry picked from commit 9ccdf3e)
    if (req->queryOOM) {
      QueryWarningsGlobalStats_UpdateWarning(QUERY_WARNING_CODE_OUT_OF_MEMORY_COORD, 1, COORD_ERR_WARN);
    }
Bug: OOM warning double counted for RESP3 responses
The OOM warning tracking added at lines 2806-2808 runs for both RESP2 and RESP3 responses, but the RESP3 path already tracks OOM warnings at line 2711, inside its `else if (req->queryOOM)` branch. When `req->queryOOM` is true and the client uses RESP3, `QueryWarningsGlobalStats_UpdateWarning` is called twice, which inflates the OOM warning counter. The tracking at lines 2806-2808 appears to be intended only for RESP2, which has no other OOM tracking, but it sits outside the if/else block.
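The double count described in this comment can be reproduced in isolation. The following is a minimal sketch, not the RediSearch code: `DemoRequest`, `demo_update_warning`, and every `DEMO_*` name are hypothetical stand-ins for the real request struct and for `QueryWarningsGlobalStats_UpdateWarning`, and the fix shown (moving the shared tracking into the RESP2 branch) is only one plausible way to resolve the issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real warning codes and global counters. */
typedef enum {
  DEMO_WARNING_TIMEOUT,
  DEMO_WARNING_OOM_COORD,
  DEMO_WARNING_CODE_COUNT
} DemoWarningCode;

static size_t demo_warning_counts[DEMO_WARNING_CODE_COUNT];

/* Assumed shape of the update helper: add `delta` to one counter. */
static void demo_update_warning(DemoWarningCode code, size_t delta) {
  assert(code < DEMO_WARNING_CODE_COUNT);
  demo_warning_counts[code] += delta;
}

/* Hypothetical request: only the two flags relevant to the bug. */
typedef struct {
  bool resp3;
  bool queryOOM;
} DemoRequest;

/* Buggy shape: the RESP3 branch counts the warning, then the shared
 * tail after the if/else counts it again. */
static void demo_reply_double_count(const DemoRequest *req) {
  if (req->resp3) {
    if (req->queryOOM) demo_update_warning(DEMO_WARNING_OOM_COORD, 1);
  } else {
    /* RESP2 branch: no OOM tracking of its own. */
  }
  if (req->queryOOM) demo_update_warning(DEMO_WARNING_OOM_COORD, 1);
}

/* One possible fix: each protocol branch counts exactly once. */
static void demo_reply_fixed(const DemoRequest *req) {
  if (req->resp3) {
    if (req->queryOOM) demo_update_warning(DEMO_WARNING_OOM_COORD, 1);
  } else {
    if (req->queryOOM) demo_update_warning(DEMO_WARNING_OOM_COORD, 1);
  }
}
```

With a RESP3 request whose `queryOOM` flag is set, `demo_reply_double_count` increments the counter twice, while `demo_reply_fixed` increments it once for either protocol.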
Codecov Report
❌ Patch coverage is

Additional details and impacted files

Coverage Diff

| Metric | 8.4 | #7576 | +/- |
|---|---|---|---|
| Coverage | 85.95% | 85.89% | -0.06% |
| Files | 331 | 331 | |
| Lines | 52667 | 52701 | +34 |
| Branches | 12004 | 12004 | |
| Hits | 45272 | 45270 | -2 |
| Misses | 7228 | 7264 | +36 |
| Partials | 167 | 167 | |
**Bug Fixes:**
* [redis#7385](RediSearch/RediSearch#7385) Fix high temporary memory consumption when loading multiple search indexes from RDB
* [redis#7430](RediSearch/RediSearch#7430) Fix a potential deadlock in `FT.HYBRID` in cluster mode during updates
* [redis#7454](RediSearch/RediSearch#7454) Fix a garbage collection performance regression
* [redis#7460](RediSearch/RediSearch#7460) Fix potential double-free in Fork GC error paths
* [redis#7455](RediSearch/RediSearch#7455) Fix internal cursors not being deleted promptly in cluster mode
* [redis#7667](RediSearch/RediSearch#7667) Fix a cursor logical leak upon dropping the index
* [redis#7796](RediSearch/RediSearch#7796) Fix a potential use-after-free when removing connections
* [redis#7792](RediSearch/RediSearch#7792) Fix string comparison for binary data with embedded NULLs in TOLIST reducer in `FT.AGGREGATE`
* [redis#7823](RediSearch/RediSearch#7823) Update `FT.HYBRID` to accept vector blobs only via parameters
* [redis#7903](RediSearch/RediSearch#7903) Fix a memory leak in Hybrid ASM
* [redis#8052](RediSearch/RediSearch#8052) Fix `FT.HYBRID` behavior when used with `LOAD *`
* [redis#8082](RediSearch/RediSearch#8082) Fix incorrect FULLTEXT field metric counts
* [redis#8089](RediSearch/RediSearch#8089) Fix an edge case in `CLUSTERSET` handling
* [redis#8152](RediSearch/RediSearch#8152) Fix configuration registration issues

**Improvements:**
* [redis#7427](RediSearch/RediSearch#7427) Enhance `FT.PROFILE` with vector search execution details
* [redis#7431](RediSearch/RediSearch#7431) Ensure full `FT.PROFILE` output is returned on timeout with RETURN policy
* [redis#7507](RediSearch/RediSearch#7507) Track timeout warnings and errors in INFO
* [redis#7576](RediSearch/RediSearch#7576) Track OOM warnings and errors in INFO
* [redis#7612](RediSearch/RediSearch#7612) Track `maxprefixexpansions` warnings and errors in INFO
* [redis#7960](RediSearch/RediSearch#7960) Persist query warnings across cursor reads
* [redis#7551](RediSearch/RediSearch#7551), [redis#7616](RediSearch/RediSearch#7616), [redis#7622](RediSearch/RediSearch#7622), [redis#7625](RediSearch/RediSearch#7625) Add runtime thread and pending-jobs metrics
* [redis#7589](RediSearch/RediSearch#7589) Support multiple slot ranges in `search.CLUSTERSET`
* [redis#7707](RediSearch/RediSearch#7707) Add `WITHCOUNT` support to `FT.AGGREGATE`
* [redis#7862](RediSearch/RediSearch#7862) Add support for subquery `COUNT` in `FT.HYBRID`
* [redis#8087](RediSearch/RediSearch#8087) Add warnings when cursor results may be affected by ASM and expose ASM warnings in `FT.PROFILE`
* [redis#8049](RediSearch/RediSearch#8049) Add logging for index-related commands
* [redis#8150](RediSearch/RediSearch#8150) Fix shard total profile time reporting in `FT.PROFILE`
**Bug Fixes:**
* [#7385](RediSearch/RediSearch#7385) Fix high temporary memory consumption when loading multiple search indexes from RDB
* [#7430](RediSearch/RediSearch#7430) Fix a potential deadlock in `FT.HYBRID` in cluster mode during updates
* [#7454](RediSearch/RediSearch#7454) Fix a garbage collection performance regression
* [#7460](RediSearch/RediSearch#7460) Fix potential double-free in Fork GC error paths
* [#7455](RediSearch/RediSearch#7455) Fix internal cursors not being deleted promptly in cluster mode
* [#7667](RediSearch/RediSearch#7667) Fix a cursor logical leak upon dropping the index
* [#7796](RediSearch/RediSearch#7796) Fix a potential use-after-free when removing connections
* [#7792](RediSearch/RediSearch#7792) Fix string comparison for binary data with embedded NULLs in TOLIST reducer in `FT.AGGREGATE`
* [#7704](RediSearch/RediSearch#7704) Use asynchronous jobs for GC in SVS to accelerate execution
* [#7823](RediSearch/RediSearch#7823) Update `FT.HYBRID` to accept vector blobs only via parameters
* [#7903](RediSearch/RediSearch#7903) Fix a memory leak in Hybrid ASM
* [#8052](RediSearch/RediSearch#8052) Fix `FT.HYBRID` behavior when used with `LOAD *`
* [#8082](RediSearch/RediSearch#8082) Fix incorrect FULLTEXT field metric counts
* [#8089](RediSearch/RediSearch#8089) Fix an edge case in `CLUSTERSET` handling
* [#8152](RediSearch/RediSearch#8152) Fix configuration registration issues

**Improvements:**
* [#7427](RediSearch/RediSearch#7427) Enhance `FT.PROFILE` with vector search execution details
* [#7431](RediSearch/RediSearch#7431) Ensure full `FT.PROFILE` output is returned on timeout with RETURN policy
* [#7507](RediSearch/RediSearch#7507) Track timeout warnings and errors in INFO
* [#7576](RediSearch/RediSearch#7576) Track OOM warnings and errors in INFO
* [#7612](RediSearch/RediSearch#7612) Track `maxprefixexpansions` warnings and errors in INFO
* [#7960](RediSearch/RediSearch#7960) Persist query warnings across cursor reads
* [#7551](RediSearch/RediSearch#7551), [#7616](RediSearch/RediSearch#7616), [#7622](RediSearch/RediSearch#7622), [#7625](RediSearch/RediSearch#7625) Add runtime thread and pending-jobs metrics
* [#7589](RediSearch/RediSearch#7589) Support multiple slot ranges in `search.CLUSTERSET`
* [#7707](RediSearch/RediSearch#7707) Add `WITHCOUNT` support to `FT.AGGREGATE`
* [#7862](RediSearch/RediSearch#7862) Add support for subquery `COUNT` in `FT.HYBRID`
* [#8087](RediSearch/RediSearch#8087) Add warnings when cursor results may be affected by ASM and expose ASM warnings in `FT.PROFILE`
* [#8049](RediSearch/RediSearch#8049) Add logging for index-related commands
* [#8150](RediSearch/RediSearch#8150) Fix shard total profile time reporting in `FT.PROFILE`
backport #7452 to 8.4
Note
Track and expose OOM query errors and warnings across coordinator and shards, updating execution paths and INFO output, with tests covering standalone/cluster and RESP2/RESP3.
* […] `QueryErrorsGlobalStats`/`QueryWarningGlobalStats` and include them in `TotalGlobalStats_GetQueryStats` and `INFO MODULES` output (both shard and coordinator sections).
* […] `QueryErrorsGlobalStats_UpdateError`/`QueryWarningsGlobalStats_UpdateWarning` to handle OOM codes (`QUERY_EOOM`, `QUERY_WARNING_CODE_OUT_OF_MEMORY_{COORD,SHARD}`).
* […] `aggregate_exec.c`, `reply_empty.c`, `hybrid_exec.c`, `module.c`).

Written by Cursor Bugbot for commit fdd7a03.
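To make the shard/coordinator split in this summary concrete, here is a minimal sketch of separate OOM error and warning counters that are updated atomically and rendered as INFO-style lines. Everything in it is an assumption for illustration: the `Demo*` types, the `demo_*` functions, and the `demo_oom_errors`/`demo_oom_warnings` field names are hypothetical and do not reflect the actual layout of `QueryErrorsGlobalStats` or the real `INFO MODULES` field names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical source selector: which section a counter belongs to. */
typedef enum { DEMO_SHARD = 0, DEMO_COORD = 1, DEMO_SOURCE_COUNT } DemoSource;

/* Hypothetical per-section counters; atomics so that query threads
 * can update them without holding a lock. */
typedef struct {
  atomic_size_t oom_errors;
  atomic_size_t oom_warnings;
} DemoQueryStats;

/* Static storage, so all counters start at zero. */
static DemoQueryStats demo_stats[DEMO_SOURCE_COUNT];

static void demo_track_oom_error(DemoSource src) {
  assert(src == DEMO_SHARD || src == DEMO_COORD);
  atomic_fetch_add(&demo_stats[src].oom_errors, 1);
}

static void demo_track_oom_warning(DemoSource src) {
  assert(src == DEMO_SHARD || src == DEMO_COORD);
  atomic_fetch_add(&demo_stats[src].oom_warnings, 1);
}

/* Render one section as INFO-style "name:value" lines (field names are
 * made up). Returns what snprintf returns: the length that would have
 * been written, excluding the terminating NUL. */
static int demo_format_info(DemoSource src, char *buf, size_t len) {
  return snprintf(buf, len,
                  "demo_oom_errors:%zu\r\ndemo_oom_warnings:%zu\r\n",
                  atomic_load(&demo_stats[src].oom_errors),
                  atomic_load(&demo_stats[src].oom_warnings));
}
```

Keeping one counter set per section means a coordinator-side warning never leaks into the shard totals, which matches the summary's note that both sections are reported separately.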