Make readQueryFromClient more aggressive when reading big arg again (Followup for #9003)#9100
Merged
oranagra merged 2 commits intoredis:unstablefrom Jun 17, 2021
Merged
Conversation
oranagra
approved these changes
Jun 17, 2021
JackieXie168
pushed a commit
to JackieXie168/redis
that referenced
this pull request
Sep 8, 2021
…Followup for redis#9003) (redis#9100) Due to the change in redis#9003, a long-standing bug was raised under `valgrind`. This bug can cause the master-slave sync to take a very long time, causing the `pendingquerybuf.tcl` test to fail. This problem does not only occur in master-slave sync, it is triggered when the big arg is greater than 32k. step: ```sh dd if=/dev/zero of=bigfile bs=1M count=32 ./src/redis-cli -x hset a a < bigfile ``` 1) Make room for querybuf in processMultibulkBuffer, now the alloc of querybuf will be more than 32k. 2) If this happens to trigger the `clientsCronResizeQueryBuffer`, querybuf will be resized to 0. 3) Finally, in readQueryFromClient, we expand the querybuf non-greedily, from 0 to 32k. Old code, make room for querybuf is greedy, so it only needs 11 times to expand to 32M(16k*(2^11)), but now we need 2048(32*1024/16) times to reach it, due to the slow allocation under valgrind that exposed the problem. The fix for the excessive shrinking of the query buf to 0, will be handled in redis#5013 (that other change on it's own can fix failing test too), but the fix in this PR will also fix the failing test. The fix in this PR will makes the reading in `readQueryFromClient` more aggressive when working on a big arg (so that it is in par with the same code in `processMultibulkBuffer` (i.e. the two calls to `sdsMakeRoomForNonGreedy` should both use the bulk size). In the code before this fix the one in readQueryFromClient always has `readlen = PROTO_IOBUF_LEN`
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
@oranagra @yoav-steinberg
Due to the change in #9003, a long-standing bug was raised under
valgrind.This bug can cause the master-slave sync to take a very long time, causing the
pendingquerybuf.tcltest to fail.This problem does not only occur in master-slave sync, it is triggered when the big arg is greater than 32k.
step:
dd if=/dev/zero of=bigfile bs=1M count=32 ./src/redis-cli -x hset a a < bigfileredis/src/networking.c
Line 2159 in b586d5b
clientsCronResizeQueryBuffer, querybuf will be resized to 0.Old code, make room for querybuf is greedy, so it only needs 11 times to expand to 32M(16k*(2^11)), but now we need 2048(32*1024/16) times to reach it, due to the slow allocation under valgrind tha exposed the problem.
The fix for the excessive shrinking of the query buf to 0, will be handled in #5013 (that other change on it's own can fix failing test too), but the fix in this PR will also fix the failing test.
The fix in this PR will make the reading in
readQueryFromClientmore aggressive when working on a big arg (so that it is in par with the same code inprocessMultibulkBuffer(i.e. the two calls tosdsMakeRoomForNonGreedyshould both use the bulk size).In the current code (before this fix) the one in readQueryFromClient always has
readlen = PROTO_IOBUF_LEN