Fix bug in commit_ret_elems which lead to "live leak" in case of requests batch.#278
Fix bug in commit_ret_elems which lead to "live leak" in case of requests batch.#278greensky00 merged 3 commits intoeBay:masterfrom
commit_ret_elems which lead to "live leak" in case of requests batch.#278Conversation
Merge with upstream
Merge upstream
…atch Fix bug which lead to memory leak (cherry picked from commit 1707a75)
| /// condition) we have to add it here. Otherwise we don't need to add | ||
| /// anything into commit_ret_elems_, because nobody will wait for the | ||
| /// responses of the intermediate requests from requests batch. | ||
| bool need_to_check_commit_ret = sm_idx == pc_idx; |
There was a problem hiding this comment.
Thanks for this fix. To make sure if I understand this condition correctly, can you confirm the below?
Let's say the current committed index is 10 (for all pre-commit, quick-commit, and sm-commit), and two threads (T1 and T2) are calling handle_cli_req with exactly one log. What will happen is:
- T1 calls
handle_cli_reqfirst,precommit_index_becomes 11. - Before T1 creates the corresponding
commit_ret_elem, the commit thread wakes up earlier. - Since
sm_idx == pc_idxin this case, the commit thread will create it. - However, T2 cannot interfere before step (3), because of the lock held by
handle_cli_req_prelock. T2 can append its log only after thecommit_ret_elemfor 11 is created.
So, if a thread order inversion happens, only one user thread is involved at a time. When the commit thread wakes up, there is no case that there are multiple handle_cli_req calls whose commit_ret_elems are not created yet. Is that right?
There was a problem hiding this comment.
Actually I didn't think about such case, but yes handle_cli_req guarded by mutex (or recursive mutex for some reason...) in handle_cli_req_prelock: https://github.com/ClickHouse-Extras/NuRaft/blob/1707a7572aa66ec5d0a2dbe2bf5effa3352e6b2d/src/handle_client_request.cxx#L40-L53.
There was a problem hiding this comment.
BTW I cannot catch "thread inversion" even under heavy load. But according to the code it should be possible.
There was a problem hiding this comment.
Thanks, let me merge it.
I think the code contains explicit comment. Fixes #239.