Bug fixes related to Timeout with return partial results policy #5556
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

@@ Coverage Diff @@
##        meiravg_debug_query_command    #5556      +/-   ##
===============================================================
+ Coverage                     87.63%   87.85%    +0.22%
===============================================================
  Files                           196      196
  Lines                         35592    35603       +11
===============================================================
+ Hits                          31190    31278       +88
+ Misses                         4402     4325       -77
src/coord/rmr/rmr.c (Outdated)

    // Unblock the channel before calling `ProcessDone`, as it may trigger the freeing of the iterator
    MRChannel_Unblock(ctx->it->ctx.chan);
    MRIterator_Release(ctx->it);
- The comment is incorrect.
- As discussed, this is a race condition and we should unblock the channel after attempting to release the iterator (setting the freeing flag). IMO the call to unblock should be part of the release flow, but we can also make the `MRIterator_Release` call return a value indicating whether we freed the iterator or not, and call the unblock accordingly.
raz-mon
left a comment
GREAT work!
Partial review (since more changes are on the way)
    pthread_mutex_lock(&chan->lock);
    while (!chan->size) {
      if (!chan->wait) {
        chan->wait = true; // reset the flag
Since we now necessarily only unblock a single reader, should we change the broadcast call to a signal in the unblock function?
what are the risks? what is the benefit?
future-proofing, if we ever want to have multiple readers. We now unblock a single reader, so why wake everyone up?
    // Unblock the channel before calling `ProcessDone`, as it may trigger the freeing of the iterator
    MRChannel_Unblock(ctx->it->ctx.chan);
    }
    RS_LOG_ASSERT(pending >= 0, "Pending should not reach a negative value");
Can we replace this with a simple assert call?
what's the difference?
Hopefully, soon, we will have dev-time and prod-time assertions, this one should be a dev-time assertion IMO, so no need for a redis module API assertion
why do you think this is a dev-time assertion?
I suggest replacing it with a simple assert + log message so we won't crash (here) in production, WDYT?
tests/pytests/test_cursors.py (Outdated)

    # Ensure the cursor is properly depleted after one FT.CURSOR READ
    iter = 0
    while (cursor):
why use a while loop if we expect a single read?
To simulate the scenario we don't expect and have debugging info if it happens
so why not assert that we get cursor id 0 from the read, and add meaningful info if it fails?
Running on all Linux platforms: the test is flaky because the timeout is not always triggered; it depends on the timing of the query execution. Now, after the bug fix, when the timeout happens, no results are returned, causing the test to fail as it expects the results count to be greater than 0. I'll fix the test to enforce the timeout by calling the debug query mechanism.
add test
fix bug 1: aggregate_exec:sendChunk_Resp3: don't change rc upon timeout, to avoid returning empty results. Bug fix 2: reset the sorter timeout so we return EOF after depleting the heap. Bug fix 3: if the shards didn't send any reply (currently assumed possible only due to timeout) but did finish collecting the results, then inProcess would be set to 0 while pending is still greater than 0, and the cursor is not depleted. In this case, we unblock the channel and signal it to proceed. We rely on the coordinator to check the timeout; if it timed out, it returns. To allow that, getNextReply returns pending (the number of shards that potentially have results) instead of 0, so rpnet won't return EOF, but timeout. Enhancement: rpnetNext caches the results length instead of getting it each time.
Before this fix, self->remaining was decreased and then base->upstream->Next(base->upstream, r) was returned, but the ->Next call could have returned no results, so the counter was wrongly updated for it. This caused a wrong number of accumulated results returned by the cursor when a timeout was reached in one or more of the cursor's read commands. Each timed-out read lost one result.
that will add the timeout RP only if the query runs in a shard; add tst_timeout; add to debug tests: TeststrictPolicy, testInternalOnly; add test_cursors:testTimeoutPartialWithEmptyResults; rename parseDebugParams -> parseDebugParamsCount
… N==0 to avoid a coordinator infinite loop; fix missing return in debug commands
The problem was that we were trying to decrement &ctx->it->ctx.inProcess when the shard decided to release the iterator (MRIterator_Release released it). The solution is to change the MRIterator_Release signature: it now returns whether it freed the iterator or not. If it was released, return
…itions with the shards. Ensure the sequence of returning from MRChannel_Pop and re-blocking the channel is atomic under the channel guard; remove MRChannel_Block
…imulate well the scenario, i.e. the shard returns a timeout error string, but the coordinator clock is not set to time out. Trying to only return an error from the shards without any special changes in the coordinator: RPNetNext returns an error (and exits the while loop) only if the reply contains a timeout AND the policy is to fail; otherwise we enter another while-loop iteration without processing the reply, overriding it on the next loop, which is a leak. Another approach I took: in the callback passed to the result processor, I tried to first call getNextReply before rpnetNext checks the timeout, but it didn't work for a reason I couldn't figure out (the shards didn't shut down; might be a bug in RLTest). Returning a generic query code doesn't simulate the required scenario.
skip testEmptyResult in coord
remove sortable
1e9e5a5 to be8eae2
raz-mon
left a comment
Really nice!
Kudos for the super clear docs all around!
Bravo 👏
    # This simulates a scenario where shards return empty results due to timeout (so the cursor is still valid),
    # but the coordinator managed to call 'getNextReply' and waits for replies in MRChannel_Pop, before it checked timeout.
    # Note: An empty reply does not wake up the coordinator.
Now it does; use "did" for clarity.
OK it's clear from the rest as well
It still doesn't. The channel will be unblocked because all the shards are done sending the current poll of results.
simplify testCursorDepletionNonStrictTimeoutPolicySortby test
* reset self->timedOut; add test
* revert original test
* support empty results (to be reverted)
* revert getting empty result. Bug fix 1: aggregate_exec:sendChunk_Resp3: don't change rc upon timeout, to avoid returning empty results. Bug fix 2: reset the sorter timeout so we return EOF after depleting the heap. Bug fix 3: if the shards didn't send any reply (currently assumed possible only due to timeout) but did finish collecting the results, then inProcess would be set to 0 while pending is still greater than 0, and the cursor is not depleted; in this case we unblock the channel and signal it to proceed. We rely on the coordinator to check the timeout; if it timed out, it returns. To allow that, getNextReply returns pending (the number of shards that potentially have results) instead of 0, so rpnet won't return EOF, but timeout. Enhancement: rpnetNext caches the results length instead of getting it each time.
* revert result len
* revert resp2 reply length
* Add query debug commands. API: _FT.DEBUG <query> <DEBUG_TYPE> <DEBUG_TYPE_ARGS> ... DEBUG_PARAMS_COUNT 2. The query can be either FT.AGGREGATE ... or FT.SEARCH ... (with all supported args). DEBUG_TYPE currently supports TIMEOUT_AFTER_N, followed by the number of results to return before the timeout is returned, including 0. Can be called both in SA and cluster: _FT.DEBUG FT.AGGREGATE, _FT.DEBUG FT.SEARCH. Can be called only if the number of shards is 1: _FT.DEBUG _FT.AGGREGATE, _FT.DEBUG _FT.SEARCH. Changes taking place in production functions: module.c: in DistSearchCommand (FT.SEARCH) and DistAggregateCommand (FT.AGGREGATE), check whether it's a debug command; if debug and 1 shard, call DEBUG_RSSearchCommand or DEBUG_RSAggregateCommand for search and aggregate, respectively; if debug and cluster, call DEBUG_DistSearchCommandHandler or DEBUG_RSExecDistAggregate for search and aggregate, respectively.
* Split FlatSearchCommandHandler into: CreateReq, prepareCommand, MR_CreateCtx, MRCtx_SetReduceFunction, MR_Fanout. DEBUG_FlatSearchCommandHandler flow: CreateReq, prepareCommand, add _FT.DEBUG as the first arg, add the debug params at the end. Expose DistAggregateCommand and DistSearchCommand so they can be called from debug_commands.c. aggregate_exec.c: split execCommandCommon into prepareRequest and buildPipelineAndExecute; check AREQ->flags & QEXEC_F_DEBUG in prepareExecutionPlan; change the withProfile arg to execOptions; change the profile options to bit flags; remove NO_PROFILE and replace it with a flag indicating this is a profile command (EXEC_WITH_PROFILE), then, if it is a profile, check whether the EXEC_WITH_PROFILE_LIMITED bit is set; parseProfile also gets execOptions. DEBUG_* commands are defined in debug_commands.h. New: RPTimeoutAfterCount_New, a result processor that simulates a timeout after n results; AREQ_Debug_New creates an AREQ_Debug holding a request by value, hence freed together; new AREQ flag: QEXEC_F_DEBUG.
* fix rppagerNext_Limit counter update: before this fix, self->remaining was decreased and then base->upstream->Next(base->upstream, r) was returned, but the ->Next call could have returned no results, so the counter was wrongly updated for it. This caused a wrong number of accumulated results returned by the cursor when a timeout was reached in one or more of the cursor's read commands. Each timed-out read lost one result.
* finalize debug command test
* add GetNumShards_UnSafe to read the number of shards; use it in the debug command instead of having an if in the production function
* add sanity test
* add optional parameter [INTERNAL_ONLY] that will add the timeout RP only if the query runs in a shard; add tst_timeout; add to debug tests: TeststrictPolicy, testInternalOnly; add test_cursors:testTimeoutPartialWithEmptyResults; rename parseDebugParams -> parseDebugParamsCount
* finalize tests
* fix comment
* remove bug fixes: 1. sorter 2. pager 3. resp3 change rc in RS_RESULT_TIMEDOUT 4. coord hanging tests: revert test cursors with _FT.DEBUG QUERY; remove new test_timeout
* add timeout build mechanism
* set non-strict mode for debug tests; isClusterCoord to check whether we are the coordinator in cluster mode
* REVERT timeout on build
* Bug fixes related to Timeout with return partial results policy (#5556): its squashed message repeats the bullets above
* disable testTimeoutPartialWithEmptyResults test
* enforce timeout when calling INTERNAL_ONLY with TIMEOUT_AFTER_N where N==0 to avoid a coordinator infinite loop; fix missing return in debug commands
* fix tests: explicitly set the timeout policy to on_timeout return
* fix use-after-free in rmr.c:MRIteratorCallback_ProcessDone: the problem was that we were trying to decrement &ctx->it->ctx.inProcess when the shard decided to release the iterator (MRIterator_Release released it). The solution is to change the MRIterator_Release signature: it now returns whether it freed the iterator or not.
* return false if the free flag is set to false
* move the reader's reset of the freeFlag to the uv thread to avoid race conditions with the shards; ensure the sequence of returning from MRChannel_Pop and re-blocking the channel is atomic under the channel guard; remove MRChannel_Block
* fix leak
* simulate timeout during build: TO BE REVERTED, as it currently doesn't simulate the scenario well, i.e. the shard returns a timeout error string, but the coordinator clock is not set to time out. Trying to only return an error from the shards without any special changes in the coordinator: RPNetNext returns an error (and exits the while loop) only if the reply contains a timeout AND the policy is to fail; otherwise we enter another while-loop iteration without processing the reply, overriding it on the next loop, which is a leak. Another approach I took: in the callback passed to the result processor, I tried to first call getNextReply before rpnetNext checks the timeout, but it didn't work for a reason I couldn't figure out (the shards didn't shut down; might be a bug in RLTest). Returning a generic query code doesn't simulate the required scenario.
* REVERT timeout on build
* enable test_cursors:testTimeoutPartialWithEmptyResults; skip testEmptyResult in coord
* fix comment; remove sortable
* stabilize test_error_with_partial_results by using query debug
* move the reader's reset of the freeFlag to the uv thread (same change as above)
* signal one reader; simplify testCursorDepletionNonStrictTimeoutPolicySortby test
* fix typo
* revert force failure
* fix spelling errors
* review fixes: remove load from the AggregateDebug test; introduce DistAggregateCleanups to handle `goto err` in DEBUG_RSExecDistAggregate and RSExecDistAggregate; comments in PipelineAddTimeoutAfterCount
* move debug things to a header
* elaborate comment
* more details in aggregate_debug.h; reference the cursor test in the header; modify and add some comments
* fix comment in rp; use cursor to parse debug args; fix
* disable internal-only when used with 0 results and no user cursor
* remove DUMMY_DEBUG_OPTION
* fix
(cherry picked from commit d45701e)
* Introducing query debug mechanism 🎉 (#5554): its squashed message repeats the same commit list as above (cherry picked from commit d45701e)
* fix
This PR addresses multiple bugs that occur when the timeout policy is set to return partial results (`ON_TIMEOUT RETURN`).
### MOD-8483 Bug 1: Cursor with `SORTBY` is never depleted

When a cursor read with a sorter encounters a timeout during the sorter's accumulation phase, the sorter switches to yield mode and sets its `timedOut` flag. After this, it begins emptying its heap, returning the results collected up to the timeout. During this process, each returned result has the status `RS_RESULT_OK`. Once the heap is emptied, the sorter returns `RS_RESULT_TIMEDOUT`, and the current read command is done, with a valid cursor id. At this point, every subsequent `FT.CURSOR READ` call receives `RS_RESULT_TIMEDOUT`, even though no more results are available. Since the cursor is only considered depleted when `RS_RESULT_EOF` is returned, it remains open indefinitely despite having no results left to provide.

**Fix:**
To address this issue, the `timedOut` flag is now reset after the timeout status is returned. This ensures that subsequent `FT.CURSOR READ` commands return `RS_RESULT_EOF`, allowing the cursor to be properly marked as depleted once all results have been processed.

### MOD-8482 Bug 2: Serializing empty result with RESP3
In `sendChunk_Resp3`, during the initial execution of the query pipeline, if a timeout occurred before any results were successfully collected, the return code (`rc`) was incorrectly changed from `RS_RESULT_TIMEDOUT` to `RS_RESULT_OK`. This caused an empty result to be serialized and incorrectly returned as a valid result.

Additionally, in standalone mode (SA), the `RP_INDEX` result processor checks for timeouts every 100 results. If a timeout occurred during the first pipeline run, the timeout counter of `RP_INDEX` was reset to 0, allowing up to 99 valid results to be returned before the next timeout check, thereby violating the timeout limit.

For subsequent pipeline runs, the `rc` for `RS_RESULT_TIMEDOUT` was handled correctly and not altered.

**Fix:**
The fix ensures that `rc` is not changed when a timeout occurs.
### MOD-8515 Bug 3: `FT.AGGREGATE WITHCURSOR` hangs when shards return 0 results, but the cursor is not depleted

When shards return an empty results array (length 0), `MRIteratorCallback_AddReply` is not called to push the result to the channel, leaving the reader waiting indefinitely in `MRChannel_Pop`. Additionally, since the cursor is not depleted, the `MRIteratorCallback_Done` function, which decreases the pending count and calls `MRChannel_Unblock` to wake the waiting `MRChannel_Pop`, is skipped. Instead, `MRIteratorCallback_ProcessDone` is called, completing the results collection from the shard but leaving `MRChannel_Pop` in a hanging state.

**Fix:**
Move the `MRChannel_Unblock` call into `MRIteratorCallback_ProcessDone`, ensuring it is executed even when the cursor is not depleted. This change releases the conditional wait in `MRChannel_Pop`, preventing the hang. Additionally, when no results are returned from the shards, we avoid returning EOF. Instead, a valid cursor ID is returned gracefully, allowing the next `FT.CURSOR READ` to attempt to collect results.

### MOD-8606 Bug 4: `RP_Pager` incorrectly updates its counter when no results are received
RP_Pagerincorrectly updates its counter when no results are receivedWhen used with cursors, this behavior causes the counter to be incorrectly updated for each
FT.CURSOR READthat times out. As a result, the total number of results returned by all cursor reads is less than the requestedLIMIT, specifically reduced by the number ofFT.CURSOR READcommands that encountered a timeout.for example:
**Fix:**
Ensure that the counter is only updated when the upstream RP returns `RS_RESULT_OK`. This prevents erroneous decrements when no results are received, maintaining the correct count and ensuring the `LIMIT` is respected.

### MOD-8794 Bug 5: `freeFlag` race condition
There is a race condition on `it->ctx.freeFlag`, which is checked in `MRIterator_Release` to trigger the iterator release process, in the following scenario:

1. Shard thread: a shard decreases the `inProcess` atomic counter; the counter reaches 0.
2. Coordinator thread: the coordinator (reader) reaches `MR_ManuallyTriggerNextIfNeeded`; since `inProcess == 0` and `it->ctx.pending > 0`, it sets `it->ctx.freeFlag = false` and triggers another round of cursor commands to the shards.
3. Shard thread: since `inProcess == 0`, the shard calls `MRIterator_Release`, which sets `it->ctx.freeFlag = true` and returns immediately, because the previous value of `it->ctx.freeFlag` was false.
4. Now `it->ctx.freeFlag == true`, so the next shard to reach `MRIterator_Release` will release the iterator prematurely.

**Fix:**
Move the `it->ctx.freeFlag = false` reset from the coordinator (reader) thread to the uv thread, ensuring it happens sequentially after the shards (writers) are all done.

### MOD-8792 (Not related to timeout) Wrong string length of command arg in `FlatSearchCommandHandler`

`MRCommand_Insert` expects to receive the length of the string written into `n_prefixes`. Instead, we passed the size of the pointer variable.
**Fix:**
`rm_asprintf` returns the number of bytes printed (excluding the null terminator); use this value as the length of the command string.