[8.2] [MOD-11155] Create basic query pause debug command#7579
Conversation
* feat: add unified memory consumption checker for Redis * feat: implement unified memory consumption checker for Redis * refactor: move MIN_NOT_0 macro definition to redis_mem_info.c * fix: include minmax.h for MIN_NOT_0 macro definition * feat: add debug logging for memory usage check in ForkGC * feat: add QueryDebugCtx struct for debugging queries * fix: correct memory usage check and improve memory ratio calculation * fix: update debug command structures for improved clarity and functionality * reorder result_processor.h s.t Debug processors are at the end * reorder result_processor.c s.t Debug processors are at the end * fix: reorder ResultProcessorType enum values for clarity * fix: categorize debug-only result processors for better readability * add basic RP controller debug command * restructer parseAndCompile * basic query pause * remove condition as it might create a race * add RP before type function * remove problematic checks * Add basic pytests * Support insert debug rp before/after specific rp type * test insert debug rp and print rp stream * Add PRINT_RP_STREAM command to FT.DEBUG QUERY_CONTROLLER * Skip cluster test * fix error message * add active queries check * Don't forget the @ * cover invalid rp type * Remove redundant function declaration and add comment * fix TBD comment * change to return found * return error if RP is missing from stream, change to 1 return statement * add new options documentation to aggregate_debug.h * remove doube whitespace * space before var name * remove PAUSE_AFTER_N * change order of args in pause before/after * Test cluster * cluster test * confrim query in bg * Cover error when ran without workers * Guard against insert after Index RP * fixing edge case * fix leak * Apply suggestions from code review Co-authored-by: meiravgri <[email protected]> * Add error handling for resume command when no query is paused * Validate only 1 query is able to run at each time * comment * Add assertion in RPPauseAfterCount_Free to validate debugRP state before clearing * Enhance debug documentation for PAUSE_AFTER_RP_N and PAUSE_BEFORE_RP_N commands to clarify RP type restrictions and timeout behavior. * Remove unused field * add rp str to error msg * remvoe "IN" from error msg * fix test * Fix memory management in RPPauseAfterCount_Free by ensuring rm_free is called consistently. * Global debug ctx API * change expected doc number * Check active queries on all shards * Refactor test_query_controller_pause_and_resume to streamline index creation * Change mechanism in ft.aggregate in cluster * Test coord pause piepline * Enhance error handling for INTERNAL_ONLY usage in debug commands * Round 1 * Remove air conditionar * remove revach * cover invalid count * Move debug RPs to end of file, add debug pause RP (cherry picked from commit 583fee9) * Adjust to new context (cherry picked from commit ed69f6b) * remove duplicate * Re-add debug * remove not thread safe check --------- Co-authored-by: meiravgri <[email protected]> (cherry picked from commit 9d344ab)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## 8.2 #7579 +/- ##
==========================================
- Coverage 89.48% 89.48% -0.01%
==========================================
Files 253 253
Lines 41009 41197 +188
Branches 3725 3725
==========================================
+ Hits 36697 36864 +167
- Misses 4263 4284 +21
Partials 49 49
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
* [MOD-11155] Create basic query pause debug command (#6750) * feat: add unified memory consumption checker for Redis * feat: implement unified memory consumption checker for Redis * refactor: move MIN_NOT_0 macro definition to redis_mem_info.c * fix: include minmax.h for MIN_NOT_0 macro definition * feat: add debug logging for memory usage check in ForkGC * feat: add QueryDebugCtx struct for debugging queries * fix: correct memory usage check and improve memory ratio calculation * fix: update debug command structures for improved clarity and functionality * reorder result_processor.h s.t Debug processors are at the end * reorder result_processor.c s.t Debug processors are at the end * fix: reorder ResultProcessorType enum values for clarity * fix: categorize debug-only result processors for better readability * add basic RP controller debug command * restructer parseAndCompile * basic query pause * remove condition as it might create a race * add RP before type function * remove problematic checks * Add basic pytests * Support insert debug rp before/after specific rp type * test insert debug rp and print rp stream * Add PRINT_RP_STREAM command to FT.DEBUG QUERY_CONTROLLER * Skip cluster test * fix error message * add active queries check * Don't forget the @ * cover invalid rp type * Remove redundant function declaration and add comment * fix TBD comment * change to return found * return error if RP is missing from stream, change to 1 return statement * add new options documentation to aggregate_debug.h * remove doube whitespace * space before var name * remove PAUSE_AFTER_N * change order of args in pause before/after * Test cluster * cluster test * confrim query in bg * Cover error when ran without workers * Guard against insert after Index RP * fixing edge case * fix leak * Apply suggestions from code review Co-authored-by: meiravgri <[email protected]> * Add error handling for resume command when no query is paused * Validate only 1 query is able to run at each time * comment * Add assertion in RPPauseAfterCount_Free to validate debugRP state before clearing * Enhance debug documentation for PAUSE_AFTER_RP_N and PAUSE_BEFORE_RP_N commands to clarify RP type restrictions and timeout behavior. * Remove unused field * add rp str to error msg * remvoe "IN" from error msg * fix test * Fix memory management in RPPauseAfterCount_Free by ensuring rm_free is called consistently. * Global debug ctx API * change expected doc number * Check active queries on all shards * Refactor test_query_controller_pause_and_resume to streamline index creation * Change mechanism in ft.aggregate in cluster * Test coord pause piepline * Enhance error handling for INTERNAL_ONLY usage in debug commands * Round 1 * Remove air conditionar * remove revach * cover invalid count * Move debug RPs to end of file, add debug pause RP (cherry picked from commit 583fee9) * Adjust to new context (cherry picked from commit ed69f6b) * remove duplicate * Re-add debug * remove not thread safe check --------- Co-authored-by: meiravgri <[email protected]> (cherry picked from commit 9d344ab) * fix compilation * fix
| if (internal_only) { | ||
| QueryError_SetError(status, QUERY_EPARSEARGS, | ||
| "INTERNAL_ONLY must be used with TIMEOUT_AFTER_N"); | ||
| return REDISMODULE_OK; |
There was a problem hiding this comment.
Bug: Pause arguments silently ignored when timeout is specified
When TIMEOUT_AFTER_N and PAUSE_BEFORE_RP_N/PAUSE_AFTER_RP_N are both specified, the early return REDISMODULE_OK at line 133 causes the pause arguments to be silently ignored. The function returns after processing timeout, never reaching the pause handling code at line 138. Users specifying both arguments would expect either both features to work together or an error indicating they're incompatible, not one argument being silently discarded.
Additional Locations (1)
|
/backport |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 2.10
git worktree add -d .worktree/backport-7579-to-2.10 origin/2.10
cd .worktree/backport-7579-to-2.10
git switch --create backport-7579-to-2.10
git cherry-pick -x e2b405da83c679a356fffde3b506b0ba6b44cbf1 |
backport #6750 to 8.2
Note
Adds PAUSE_* debug args to insert a pause RP before/after specific processors with cluster-aware behavior, plus FT.DEBUG QUERY_CONTROLLER to resume/check/inspect, with validations and comprehensive tests.
PAUSE_AFTER_RP_N <RP_TYPE> <N> [INTERNAL_ONLY]andPAUSE_BEFORE_RP_N <RP_TYPE> <N> [INTERNAL_ONLY]in_FT.DEBUGforFT.SEARCH/FT.AGGREGATE.RP_PAUSE(RPPauseAfterCount) andPipelineAddPauseRPcount(...)to inject a pause RP before/after the first matchingRP_TYPE(validates type/existence, disallows last-RP after-insert).QueryDebugCtx(single paused query), with APIs to set/get pause state and the active debug RP.INTERNAL_ONLYsemantics forFT.AGGREGATE, forbidCRASHwithINTERNAL_ONLY, and error on standaloneINTERNAL_ONLY.StringToRPType(...)and expandResultProcessorTypewithRP_PAUSE.FT.DEBUG QUERY_CONTROLLERsubcommands:SET_PAUSE_RP_RESUME,GET_IS_RP_PAUSED,PRINT_RP_STREAMto control and inspect paused queries.TIMEOUT_AFTER_Nlogic; force coordinator timeout forN==0with disabled timeouts; validate argument combinations.parseDebugQueryCommandArgs, pause/resume utilities) and extensive SA/cluster tests for pause insertion, controller ops, RP stream printing, and error cases.Written by Cursor Bugbot for commit b5e1720. This will update automatically on new commits. Configure here.