[8.2] [MOD-12419] Track Timeout errors and warnings in info#7614
[8.2] [MOD-12419] Track Timeout errors and warnings in info#7614
Conversation
* [MOD-12419] Track Timeout errors and warnings in info ([#7414](#7414)) * Track timeout in sendchunk resp2 * Track timeout warning in sendSearchResults * Track timeout error in searchResultReducer (cherry picked from commit fafe1dc) * Track timeout in sendChunk_hybrid (cherry picked from commit 8529cb9) * test timeout metrics (cherry picked from commit c56a0d9) * fix isCoord check * Add query warning code and add function and fields needed to track (cherry picked from commit a414641) * Track timeout in sendchunk resp3 (cherry picked from commit 6853bb9) * readd skip * Update syntax and args error to new SA as cluster * format and enrico comment * Change test to N=0 with Internal only (not working so revert afterwards) * Revert "Change test to N=0 with Internal only (not working so revert afterwards)" This reverts commit 829ac53. * meirav comments * Stablize tests * Add resp3 test * _disable_ hybrid sa timeout * Make test robust * fixup! Make test robust * remove limits (cherry picked from commit 6eee260) * Align to 8.4 (cherry picked from commit ac92116)
tests/pytests/test_info_modules.py
Outdated
| shard_conn = env.getConnection(shardId) | ||
| _verify_metrics_not_changed(env, shard_conn, before_info_dicts[shardId], tested_in_this_test) | ||
|
|
||
| # @skip(cluster=False) |
There was a problem hiding this comment.
Bug: Commented-out decorator accidentally committed to test file
A commented-out decorator # @skip(cluster=False) was added before the test_active_io_threads_stats function. This appears to be a remnant from testing or debugging that was accidentally left in the codebase. Commented-out decorators serve no functional purpose and add noise to the code.
| return (!errStr | ||
| || RSGlobalConfig.requestConfigParams.timeoutPolicy == TimeoutPolicy_Fail | ||
| || strcmp(errStr, "Timeout limit was reached")); | ||
| // TODO - get RequestConfig ptr as parameter instead of global config |
There was a problem hiding this comment.
is the comment relevant?
f7c045b to
d64876c
Compare
| } | ||
| if (rc == RS_RESULT_TIMEDOUT) { | ||
| // Track warnings in global statistics | ||
| QueryWarningsGlobalStats_UpdateWarning(QUERY_WARNING_CODE_TIMED_OUT, 1, !IsInternal(req)); |
There was a problem hiding this comment.
Bug: RESP3 timeout warning tracking misses certain timeout conditions
The RESP3 code path (sendChunk_Resp3) tracks timeout warnings only when rc == RS_RESULT_TIMEDOUT, while the RESP2 code path (sendChunk_Resp2) uses the broader has_timedout condition which includes (rc == RS_RESULT_TIMEDOUT) || hasTimeoutError(req->qiter.err). This inconsistency means timeouts recorded in req->qiter.err but not reflected in rc will be tracked in RESP2 but missed in RESP3. The RESP3 warning tracking at line 662 uses a narrower check than the has_timedout variable computed at line 678.
Additional Locations (1)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 8.2 #7614 +/- ##
==========================================
- Coverage 89.54% 89.51% -0.04%
==========================================
Files 253 253
Lines 41236 41290 +54
Branches 3725 3725
==========================================
+ Hits 36924 36959 +35
- Misses 4263 4282 +19
Partials 49 49
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/backport |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 2.10
git worktree add -d .worktree/backport-7614-to-2.10 origin/2.10
cd .worktree/backport-7614-to-2.10
git switch --create backport-7614-to-2.10
git cherry-pick -x c922ddcb8b7b4d859a60d22b2d882ebc6ba1c058 |
**Bug Fixes:** * [redis#7219](RediSearch/RediSearch#7219) Fix a concurrency issue on Reducer in `FT.AGGREGATE` * [redis#7255](RediSearch/RediSearch#7255) Fix `BM25STD` underflow wraparound * [redis#7275](RediSearch/RediSearch#7275) Report used memory as `unsigned long long` to avoid underflows * [redis#7264](RediSearch/RediSearch#7264) Fix `totalDocsLen` updates * [redis#6995](RediSearch/RediSearch#6995) Do not fanout `FT.INFO` to replicas * [redis#7350](RediSearch/RediSearch#7350) Fix `FT.CREATE` failure with LeanVec parameters on non-Intel architectures * [redis#7384](RediSearch/RediSearch#7384) Fix index load from RDB temporary memory overhead * [redis#7459](RediSearch/RediSearch#7459) Fix Fork GC potential double-free on error path * [redis#7458](RediSearch/RediSearch#7458) Fix a GC performence regression * [redis#7470](RediSearch/RediSearch#7470) Avoid draining worker thread pool from FLUSH callback to avoid deadlocks * [redis#7554](RediSearch/RediSearch#7554) Handle the case where `SCORE` is sent alone without extra fields (coordinator) * [redis#7685](RediSearch/RediSearch#7685) Fix cursor logical leak * [redis#7794](RediSearch/RediSearch#7794) Fix `cmp_strings()` to correctly handle binary data with embedded NULLs in TOLIST reducer in FT.AGGREGATE * [redis#7873](RediSearch/RediSearch#7873) Handle warnings in empty `FT.AGGREGATE` replies (cluster) * [redis#7886](RediSearch/RediSearch#7886) Remove non-TEXT fields from the spec keys dictionary * [redis#7904](RediSearch/RediSearch#7904) Refactor keys dictionary handling * [redis#7901](RediSearch/RediSearch#7901) Support multiple warnings in reply * [redis#8083](RediSearch/RediSearch#8083) Fix incorrect FULLTEXT field metric counts * [redis#8153](RediSearch/RediSearch#8153) Fix configuration registration issues **Improvements:** * [redis#7154](RediSearch/RediSearch#7154) `FT.AGGREGATE` can return Background Indexing OOM warnings * [redis#7083](RediSearch/RediSearch#7083) Add the default text scorer as a configuration option * [redis#7341](RediSearch/RediSearch#7341) Rename `FT.PROFILE` counter fields * [redis#7436](RediSearch/RediSearch#7436) Enhance `FT.PROFILE` with vector search execution details * [redis#7435](RediSearch/RediSearch#7435) Ensure full `FT.PROFILE` output on timeout with RETURN policy * [redis#7534](RediSearch/RediSearch#7534) Reduce the number of worker threads asynchronously to avoid deadlocks during queries * [redis#7614](RediSearch/RediSearch#7614) Track timeout warnings and errors in INFO * [redis#7646](RediSearch/RediSearch#7646) Track `maxprefixexpansions` warnings and errors in INFO * [redis#7577](RediSearch/RediSearch#7577) Track query syntax/argument errors (basis for query error tracking) * [redis#7737](RediSearch/RediSearch#7737) Add `Internal cursor reads` metric to cluster `FT.PROFILE` output * [redis#7759](RediSearch/RediSearch#7759) Extend indexing metrics * [redis#7710](RediSearch/RediSearch#7710) Support `WITHCOUNT` keyword in `FT.AGGREGATE` * [redis#7957](RediSearch/RediSearch#7957) Persist query warnings across cursor reads * [redis#8054](RediSearch/RediSearch#8054) Add logging for index-related commands * [redis#8151](RediSearch/RediSearch#8151) Fix shard total profile time reporting in `FT.PROFILE` * [redis#8103](RediSearch/RediSearch#8103) Output current thread IndexSpec information on crash
**Bug Fixes:** * [#7219](RediSearch/RediSearch#7219) Fix a concurrency issue on Reducer in `FT.AGGREGATE` * [#7255](RediSearch/RediSearch#7255) Fix `BM25STD` underflow wraparound * [#7275](RediSearch/RediSearch#7275) Report used memory as `unsigned long long` to avoid underflows * [#7264](RediSearch/RediSearch#7264) Fix `totalDocsLen` updates * [#6995](RediSearch/RediSearch#6995) Do not fanout `FT.INFO` to replicas * [#7350](RediSearch/RediSearch#7350) Fix `FT.CREATE` failure with LeanVec parameters on non-Intel architectures * [#7694](RediSearch/RediSearch#7694) Use asynchronous jobs for GC in SVS to accelerate execution * [#7384](RediSearch/RediSearch#7384) Fix index load from RDB temporary memory overhead * [#7459](RediSearch/RediSearch#7459) Fix Fork GC potential double-free on error path * [#7458](RediSearch/RediSearch#7458) Fix a GC performence regression * [#7470](RediSearch/RediSearch#7470) Avoid draining worker thread pool from FLUSH callback to avoid deadlocks * [#7554](RediSearch/RediSearch#7554) Handle the case where `SCORE` is sent alone without extra fields (coordinator) * [#7685](RediSearch/RediSearch#7685) Fix cursor logical leak * [#7794](RediSearch/RediSearch#7794) Fix `cmp_strings()` to correctly handle binary data with embedded NULLs in TOLIST reducer in FT.AGGREGATE * [#7873](RediSearch/RediSearch#7873) Handle warnings in empty `FT.AGGREGATE` replies (cluster) * [#7886](RediSearch/RediSearch#7886) Remove non-TEXT fields from the spec keys dictionary * [#7904](RediSearch/RediSearch#7904) Refactor keys dictionary handling * [#7901](RediSearch/RediSearch#7901) Support multiple warnings in reply * [#8083](RediSearch/RediSearch#8083) Fix incorrect FULLTEXT field metric counts * [#8153](RediSearch/RediSearch#8153) Fix configuration registration issues **Improvements:** * [#7154](RediSearch/RediSearch#7154) `FT.AGGREGATE` can return Background Indexing OOM warnings * [#7083](RediSearch/RediSearch#7083) Add the default text scorer as a configuration option * [#7341](RediSearch/RediSearch#7341) Rename `FT.PROFILE` counter fields * [#7436](RediSearch/RediSearch#7436) Enhance `FT.PROFILE` with vector search execution details * [#7435](RediSearch/RediSearch#7435) Ensure full `FT.PROFILE` output on timeout with RETURN policy * [#7534](RediSearch/RediSearch#7534) Reduce the number of worker threads asynchronously to avoid deadlocks during queries * [#7614](RediSearch/RediSearch#7614) Track timeout warnings and errors in INFO * [#7646](RediSearch/RediSearch#7646) Track `maxprefixexpansions` warnings and errors in INFO * [#7577](RediSearch/RediSearch#7577) Track query syntax/argument errors (basis for query error tracking) * [#7737](RediSearch/RediSearch#7737) Add `Internal cursor reads` metric to cluster `FT.PROFILE` output * [#7759](RediSearch/RediSearch#7759) Extend indexing metrics * [#7710](RediSearch/RediSearch#7710) Support `WITHCOUNT` keyword in `FT.AGGREGATE` * [#7957](RediSearch/RediSearch#7957) Persist query warnings across cursor reads * [#8054](RediSearch/RediSearch#8054) Add logging for index-related commands * [#8151](RediSearch/RediSearch#8151) Fix shard total profile time reporting in `FT.PROFILE` * [#8103](RediSearch/RediSearch#8103) Output current thread IndexSpec information on crash
backport #7414 to 8.2
Note
Adds tracking and exposure of timeout errors/warnings for queries (shard and coordinator) and updates error handling to use explicit codes.
QueryErrorsGlobalStatsandQueryWarningGlobalStats, with helpers to update/read them.QueryError_GetCodeFromMessage,QueryWarningCode_*).warningarray and track each warning.shard_total_query_errors_timeout,coord_total_query_errors_timeout,shard_total_query_warnings_timeout,coord_total_query_warnings_timeout.execCommandCommonbased on shard count.Written by Cursor Bugbot for commit d64876c. This will update automatically on new commits. Configure here.