Fix PSYNC crash with wrong offset#10243
Merged
oranagra merged 1 commit intoredis:unstablefrom Feb 6, 2022
Merged
Conversation
`PSYNC replicationid str_offset` will crash the server. The reason is in `masterTryPartialResynchronization`, we will call `getLongLongFromObjectOrReply` check the offset. With a wrong offset, it will add a reply and then trigger a full SYNC and the client become a replica. So crash in `c->bufpos == 0 && listLength(c->reply) == 0`. In this commit, we check the psync_offset before entering function `masterTryPartialResynchronization`, and return. Fix redis#10242
oranagra
approved these changes
Feb 6, 2022
Member
oranagra
left a comment
There was a problem hiding this comment.
so this was actually a bug in the PSYNC command, regardless of the assertion, right?
i.e it accepted the sync attempt, but also replied with an error.
In theory, an alternative fix could have been to replace getLongLongFromObjectOrReply with just getLongLongFromObject (remove the OrReply part)
Contributor
Author
|
Yes. And yes we can also remove the reply part. I thought about it too. which one do you prefer? a smaller diff and more elegant one |
Member
|
i prefer the one you implemented, just wanted to be sure i got the details right |
enjoy-binbin
added a commit
to enjoy-binbin/redis
that referenced
this pull request
Feb 11, 2022
Added regression tests for redis#10020 / redis#10081 / redis#10243. The above PRs fixed some crashes due to an asserting, see function `clientHasPendingReplies` (introduced in redis#9166). This commit added some tests to cover the above scenario. These tests will all fail in redis#9166, althought fixed not, there is value in adding these tests to cover and verify the changes. And it also can cover redis#8868 (verify the logs). Other changes: reduces the wait time in `waitForBgsave` and `waitForBgrewriteaof` from 1s to 50ms, which should reduce the time for some tests.
oranagra
pushed a commit
that referenced
this pull request
Feb 13, 2022
Added regression tests for #10020 / #10081 / #10243. The above PRs fixed some crashes due to an asserting, see function `clientHasPendingReplies` (introduced in #9166). This commit added some tests to cover the above scenario. These tests will all fail in #9166, althought fixed not, there is value in adding these tests to cover and verify the changes. And it also can cover #8868 (verify the logs). Other changes: 1. Reduces the wait time in `waitForBgsave` and `waitForBgrewriteaof` from 1s to 50ms, which should reduce the time for some tests. 2. Improve the test infra to print context when `assert_match` fails. 3. Improve the test infra to print `$error` when `assert_error` fails. ``` Expected an error matching 'ERR*' but got 'OK' (context: type eval line 4 cmd {assert_error "ERR*" {r set a b}} proc ::test) ```
oranagra
pushed a commit
that referenced
this pull request
Apr 27, 2022
`PSYNC replicationid str_offset` will crash the server. The reason is in `masterTryPartialResynchronization`, we will call `getLongLongFromObjectOrReply` check the offset. With a wrong offset, it will add a reply and then trigger a full SYNC and the client become a replica. So crash in `c->bufpos == 0 && listLength(c->reply) == 0`. In this commit, we check the psync_offset before entering function `masterTryPartialResynchronization`, and return. Regardless of that crash, accepting the sync, but also replying with an error would have corrupt the replication stream. (cherry picked from commit 344e41c)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PSYNC replicationid str_offsetwill crash the server.The reason is in
masterTryPartialResynchronization,we will call
getLongLongFromObjectOrReplycheck theoffset. With a wrong offset, it will add a reply and
then trigger a full SYNC and the client become a replica.
So crash in
c->bufpos == 0 && listLength(c->reply) == 0.In this commit, we check the psync_offset before entering
function
masterTryPartialResynchronization, and return.Fix #10242
The server logs: