Skip to content

Fix the wrong reisze of querybuf#9003

Merged
oranagra merged 26 commits intoredis:unstablefrom
sundb:fix-querybuf-resize
Jun 15, 2021
Merged

Fix the wrong reisze of querybuf#9003
oranagra merged 26 commits intoredis:unstablefrom
sundb:fix-querybuf-resize

Conversation

@sundb
Copy link
Collaborator

@sundb sundb commented May 29, 2021

The initialize memory of querybuf is PROTO_IOBUF_LEN(1024*16) * 2 (due to sdsMakeRoomFor being greedy), under jemalloc, the allocated memory will be 40k.
This will most likely result in the querybuf being resized when call clientsCronResizeQueryBuffer unless the client requests it fast enough.

Note that this bug existed even before #7875, since the condition for resizing includes the sds headers (32k+6).

Changes

  1. Use non-greedy sdsMakeRoomFor when allocating the initial query buffer (of 16k).
  2. Also use non-greedy allocation when working with BIG_ARG (we won't use that extra space anyway)
  3. in case we did use a greedy allocation, read as much as we can into the buffer we got (including internal frag), to reduce system calls.
  4. introduce a dedicated constant for the shrinking (same value as before)
  5. Add test for querybuf.
  6. improve a maxmemory test by ignoring the effect of replica query buffers (can accumulate many ACKs on slow env)
  7. improve a maxmemory by disabling slowlog (it will cause slight memory growth on slow env).

@sundb sundb marked this pull request as draft May 29, 2021 10:07
@oranagra oranagra added the state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten label May 29, 2021
@oranagra
Copy link
Member

@sundb AFAIR you started to investigate that to solve the replica buffer don't induce eviction test, but i see it fails in this PR.
is that related or unrelated to this test?

@sundb
Copy link
Collaborator Author

sundb commented May 29, 2021

@oranagra This pr solves the problem of replica buffer don't induce eviction test, but now the test fails because my modification causes the memory to cause the slave's querybuf to no longer be shrunk, and when the slave is killed, delta_no_repl does not count the memory added by the querybuf, I'm still testing.

@oranagra
Copy link
Member

oranagra commented May 30, 2021

Let me see if i get the the full story here.

by default when we read a query we use:

    readlen = PROTO_IOBUF_LEN;
....
    c->querybuf = sdsMakeRoomFor(c->querybuf, readlen);

PROTO_IOBUF_LEN is 16k, and sdsMakeRoomFor being greedy doubles that.
so that, together with the (not so recent) change in #7864 #7875. caused the sds size to be 48k (instead of 32k which it was before)

so when it was combined with this check (PROTO_MBULK_BIG_ARG is set to 32k, and using >) and avoid shrinking the default buffer size.

    if (querybuf_size > PROTO_MBULK_BIG_ARG &&

so in fact this is a "regression" from #7864 #7875.

@sundb
Copy link
Collaborator Author

sundb commented May 30, 2021

@sundb sundb marked this pull request as ready for review May 30, 2021 09:32
@oranagra
Copy link
Member

ohh, yeah, that's the PR i meant (took the wrong one since they had a similar title)

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry i wasn't paying close attention so far while you were investigating failures in this test. only now i bothered to read my test and remember what it was attempting to achieve.

@oranagra
Copy link
Member

oranagra commented Jun 3, 2021

i take it back, this is not a regression from #7875.
before 7875, we used to ask for 16k, and the greenness of sdsMakeRoomFor gave us 32k, but since serverCron uses sdsAllocSize rather than sdsalloc (what #5013 wants to solve), it would readk 32k+6, so the fact the code there uses > and not >= doesn't help.

so even before 7875 we would have shrieked the query buff right after it was allocated.

@sundb
Copy link
Collaborator Author

sundb commented Jun 3, 2021

):8 It's been around for 9 years.

sundb and others added 2 commits June 7, 2021 08:46
Co-authored-by: Oran Agra <[email protected]>
Co-authored-by: Oran Agra <[email protected]>
oranagra
oranagra previously approved these changes Jun 13, 2021
@oranagra
Copy link
Member

@sundb @yoav-steinberg let me know if we're good to merge this one.

@sundb
Copy link
Collaborator Author

sundb commented Jun 15, 2021

@oranagra Sorry, I can't be in front of the PC these days.
I make the following changes.

Copy link
Contributor

@yoav-steinberg yoav-steinberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on using sdsavail for updating readlen.

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much of these recent changes aren't really doing anything (e.g. the one in scripting.c). we can consider them a cleanup, but maybe since this PR is already quite big and confusing, we wanna leave that cleanup for a separate PR?

the one in tls.c seems to reveal another bug (again i think subject for another PR).
WDYT?

@sundb sundb force-pushed the fix-querybuf-resize branch from 9e8b5b7 to 2099bf1 Compare June 15, 2021 09:32
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so are we all good to merge this now?

@sundb
Copy link
Collaborator Author

sundb commented Jun 15, 2021

@oranagra Yes.

@oranagra oranagra merged commit e5d8a5e into redis:unstable Jun 15, 2021
@sundb sundb mentioned this pull request Jun 16, 2021
oranagra pushed a commit that referenced this pull request Jun 16, 2021
Fix test failure which introduced by #9003.
The following case will occur when querybuf expansion will allocate memory equal to (16*1024)k.
1) make use ```CFLAGS=-DNO_MALLOC_USABLE_SIZE```.
2) ```malloc``` will not allocate more under ```alpine```.
oranagra pushed a commit that referenced this pull request Jun 17, 2021
…Followup for #9003) (#9100)

Due to the change in #9003, a long-standing bug was raised under `valgrind`.
This bug can cause the master-slave sync to take a very long time, causing the `pendingquerybuf.tcl` test to fail.
This problem does not only occur in master-slave sync, it is triggered when the big arg is greater than 32k.
step:
```sh
dd if=/dev/zero of=bigfile bs=1M count=32
./src/redis-cli -x hset a a < bigfile
```

1) Make room for querybuf in processMultibulkBuffer, now the alloc of querybuf will be more than 32k.
2) If this happens to trigger the `clientsCronResizeQueryBuffer`, querybuf will be resized to 0.
3) Finally, in readQueryFromClient, we expand the querybuf non-greedily, from 0 to 32k.
    Old code, make room for querybuf is greedy, so it only needs 11 times to expand to 32M(16k*(2^11)),
    but now we need 2048(32*1024/16) times to reach it, due to the slow allocation under valgrind that exposed the problem.

The fix for the excessive shrinking of the query buf to 0, will be handled in #5013 (that other change on it's own can fix failing test too), but the fix in this PR will also fix the failing test.

The fix in this PR will makes the reading in `readQueryFromClient` more aggressive when working on a big arg (so that it is in par with the same code in `processMultibulkBuffer` (i.e. the two calls to `sdsMakeRoomForNonGreedy` should both use the bulk size).
In the code before this fix the one in readQueryFromClient always has `readlen = PROTO_IOBUF_LEN`
@sundb sundb deleted the fix-querybuf-resize branch June 30, 2021 02:00
JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Sep 8, 2021
The initialize memory of `querybuf` is `PROTO_IOBUF_LEN(1024*16) * 2` (due to sdsMakeRoomFor being greedy), under `jemalloc`, the allocated memory will be 40k.
This will most likely result in the `querybuf` being resized when call `clientsCronResizeQueryBuffer` unless the client requests it fast enough.

Note that this bug existed even before redis#7875, since the condition for resizing includes the sds headers (32k+6).

## Changes
1. Use non-greedy sdsMakeRoomFor when allocating the initial query buffer (of 16k).
1. Also use non-greedy allocation when working with BIG_ARG (we won't use that extra space anyway)
2. in case we did use a greedy allocation, read as much as we can into the buffer we got (including internal frag), to reduce system calls.
3. introduce a dedicated constant for the shrinking (same value as before)
3. Add test for querybuf.
4. improve a maxmemory test by ignoring the effect of replica query buffers (can accumulate many ACKs on slow env)
5. improve a maxmemory by disabling slowlog (it will cause slight memory growth on slow env).
JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Sep 8, 2021
Fix test failure which introduced by redis#9003.
The following case will occur when querybuf expansion will allocate memory equal to (16*1024)k.
1) make use ```CFLAGS=-DNO_MALLOC_USABLE_SIZE```.
2) ```malloc``` will not allocate more under ```alpine```.
JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Sep 8, 2021
…Followup for redis#9003) (redis#9100)

Due to the change in redis#9003, a long-standing bug was raised under `valgrind`.
This bug can cause the master-slave sync to take a very long time, causing the `pendingquerybuf.tcl` test to fail.
This problem does not only occur in master-slave sync, it is triggered when the big arg is greater than 32k.
step:
```sh
dd if=/dev/zero of=bigfile bs=1M count=32
./src/redis-cli -x hset a a < bigfile
```

1) Make room for querybuf in processMultibulkBuffer, now the alloc of querybuf will be more than 32k.
2) If this happens to trigger the `clientsCronResizeQueryBuffer`, querybuf will be resized to 0.
3) Finally, in readQueryFromClient, we expand the querybuf non-greedily, from 0 to 32k.
    Old code, make room for querybuf is greedy, so it only needs 11 times to expand to 32M(16k*(2^11)),
    but now we need 2048(32*1024/16) times to reach it, due to the slow allocation under valgrind that exposed the problem.

The fix for the excessive shrinking of the query buf to 0, will be handled in redis#5013 (that other change on it's own can fix failing test too), but the fix in this PR will also fix the failing test.

The fix in this PR will makes the reading in `readQueryFromClient` more aggressive when working on a big arg (so that it is in par with the same code in `processMultibulkBuffer` (i.e. the two calls to `sdsMakeRoomForNonGreedy` should both use the bulk size).
In the code before this fix the one in readQueryFromClient always has `readlen = PROTO_IOBUF_LEN`
@oranagra oranagra mentioned this pull request Oct 4, 2021
warrick1016 pushed a commit to ctripcorp/Redis-On-Rocks that referenced this pull request Sep 2, 2025
The initialize memory of `querybuf` is `PROTO_IOBUF_LEN(1024*16) * 2` (due to sdsMakeRoomFor being greedy), under `jemalloc`, the allocated memory will be 40k.
This will most likely result in the `querybuf` being resized when call `clientsCronResizeQueryBuffer` unless the client requests it fast enough.

Note that this bug existed even before redis#7875, since the condition for resizing includes the sds headers (32k+6).

1. Use non-greedy sdsMakeRoomFor when allocating the initial query buffer (of 16k).
1. Also use non-greedy allocation when working with BIG_ARG (we won't use that extra space anyway)
2. in case we did use a greedy allocation, read as much as we can into the buffer we got (including internal frag), to reduce system calls.
3. introduce a dedicated constant for the shrinking (same value as before)
3. Add test for querybuf.
4. improve a maxmemory test by ignoring the effect of replica query buffers (can accumulate many ACKs on slow env)
5. improve a maxmemory by disabling slowlog (it will cause slight memory growth on slow env).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants