Fix MRIterator ownership mechanism - [MOD-8108]#5244
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5244 +/- ##
==========================================
+ Coverage 86.60% 86.71% +0.11%
==========================================
Files 193 193
Lines 34772 34671 -101
==========================================
- Hits 30113 30065 -48
+ Misses 4659 4606 -53 ☔ View full report in Codecov by Sentry. |
Automated performance analysis summaryThis comment was automatically generated given there is performance data available. In summary:
You can check a comparison in detail via the grafana link Comparison between master and guyav-fix_cursor_deadlock.Time Period from 30 days ago. (environment used: oss-standalone)
|
raz-mon
left a comment
There was a problem hiding this comment.
AWESOME 🔥
Some small comments
raz-mon
left a comment
There was a problem hiding this comment.
💣
Consider naming the blocking variable block instead of wait, a bit more intuitive.
Great addition!
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 2.8
git worktree add -d .worktree/backport-5244-to-2.8 origin/2.8
cd .worktree/backport-5244-to-2.8
git switch --create backport-5244-to-2.8
git cherry-pick -x 6f68d69d7688807bad5b89eb53673aaaff36c9c9 |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin 2.6
git worktree add -d .worktree/backport-5244-to-2.6 origin/2.6
cd .worktree/backport-5244-to-2.6
git switch --create backport-5244-to-2.6
git cherry-pick -x 6f68d69d7688807bad5b89eb53673aaaff36c9c9 |
* implement release mechanism for MRIterator * refactor MRChannel * delete CursorList_Expire * fix test for async cursor deletion * minor cleanup * more cleanup * remove a log message that is now more likely to be logged * log about stuck RQ less frequently * even less noisy * rename MRChannel_Close (cherry picked from commit 6f68d69)
|
Successfully created backport PR for |
* implement release mechanism for MRIterator * refactor MRChannel * delete CursorList_Expire * fix test for async cursor deletion * minor cleanup * more cleanup * remove a log message that is now more likely to be logged * log about stuck RQ less frequently * even less noisy * rename MRChannel_Close (cherry picked from commit 6f68d69)
|
Successfully created backport PR for |
* implement release mechanism for MRIterator * refactor MRChannel * delete CursorList_Expire * fix test for async cursor deletion * minor cleanup * more cleanup * remove a log message that is now more likely to be logged * log about stuck RQ less frequently * even less noisy * rename MRChannel_Close (cherry picked from commit 6f68d69)
* implement release mechanism for MRIterator * refactor MRChannel * delete CursorList_Expire * fix test for async cursor deletion * minor cleanup * more cleanup * remove a log message that is now more likely to be logged * log about stuck RQ less frequently * even less noisy * rename MRChannel_Close (cherry picked from commit 6f68d69)
* Fix MRIterator ownership mechanism - [MOD-8108] (#5244) * implement release mechanism for MRIterator * refactor MRChannel * delete CursorList_Expire * fix test for async cursor deletion * minor cleanup * more cleanup * remove a log message that is now more likely to be logged * log about stuck RQ less frequently * even less noisy * rename MRChannel_Close (cherry picked from commit 6f68d69) * Fix a potential use after free - [MOD-8108] (#5268) fix a potential use after free --------- Co-authored-by: GuyAv46 <[email protected]>
* Fix MRIterator ownership mechanism - [MOD-8108] (#5244) * implement release mechanism for MRIterator * refactor MRChannel * delete CursorList_Expire * fix test for async cursor deletion * minor cleanup * more cleanup * remove a log message that is now more likely to be logged * log about stuck RQ less frequently * even less noisy * rename MRChannel_Close (cherry picked from commit 6f68d69) * Fix a potential use after free - [MOD-8108] (#5268) fix a potential use after free --------- Co-authored-by: GuyAv46 <[email protected]>
* Fix MRIterator ownership mechanism - [MOD-8108] (#5244) * implement release mechanism for MRIterator * refactor MRChannel * delete CursorList_Expire * fix test for async cursor deletion * minor cleanup * more cleanup * remove a log message that is now more likely to be logged * log about stuck RQ less frequently * even less noisy * rename MRChannel_Close (cherry picked from commit 6f68d69) * additional fixes for 2.6
* Fix MRIterator ownership mechanism - [MOD-8108] (#5244) * implement release mechanism for MRIterator * refactor MRChannel * delete CursorList_Expire * fix test for async cursor deletion * minor cleanup * more cleanup * remove a log message that is now more likely to be logged * log about stuck RQ less frequently * even less noisy * rename MRChannel_Close (cherry picked from commit 6f68d69) * additional fixes for 2.8
Implement a referencing mechanism between the reader and writers of an
MRIterator, removing the need to wait for replies to arrive before releasing the iterator.This PR fixes a potential deadlock when using cursors in cluster mode, and sometimes speeds up the freeing flow of an
FT.AGGREGATEcommand (no need to sync with the shards).Main objects this PR modified
MRIteratornow has a simple referencing mechanism (using a boolean flag). When both the reader and the writers give up the iterator, the last one to do so frees it. In the case of pending cursors,FT.CURSOR DELcommands will be sent and the iterator will free itself when all the replies return, without blocking the reader.MRChannel, so there is no waiting for it to be closed anymore. It is also possible to push new items to it after it is closed, soMRChannel_Pushcannot fail. The channel and its pending items will be freed when the iterator is freed.CursorList_Expire. The mechanism is no longer needed and we can now free the coordinator's cursors without worrying about deadlocks.#5268 is a follow-up of this PR.
Mark if applicable