Check shard_id pointer validity in updateShardId #12538
Merged
madolson merged 1 commit into redis:unstable from secwall:unstable on Sep 3, 2023
madolson
reviewed
Sep 1, 2023
Contributor
madolson
left a comment
I think this is comprehensive, but I'm going to test it a bit before merging.
Contributor
Ok, I think the best way to test this would have been #10214. Probably worth revisiting implementing this.
madolson
approved these changes
Sep 3, 2023
Merged
oranagra
pushed a commit
that referenced
this pull request
Sep 6, 2023
When connecting between a 7.0 and 7.2 cluster, the 7.0 cluster will not populate the shard_id field, which is expected on the 7.2 cluster. This is not intended behavior, as the 7.2 cluster is supposed to use a temporary shard_id while the node is in the upgrading state, but it wasn't being correctly set in this case. (cherry picked from commit a2046c1)
enjoy-binbin
added a commit
to enjoy-binbin/redis
that referenced
this pull request
Sep 23, 2023
…d 7.2 In redis#10536, we introduced the assert. Some older server versions (like 7.0) don't gossip shard_id, so we will not add the node to cluster->shards, and node->shard_id is filled in randomly and may not be found here. As a result, if we add a 7.2 node to a 7.0 cluster and allocate slots to the 7.2 node, the 7.2 node will crash when it hits this assert. This is somewhat like redis#12538. In this PR, we remove the assert and replace it with a plain if. Fixes redis#12603.
madolson
pushed a commit
that referenced
this pull request
Oct 12, 2023
…d 7.2 (#12604) In #10536, we introduced the assert. Some older server versions (like 7.0) don't gossip shard_id, so we will not add the node to cluster->shards, and node->shard_id is filled in randomly and may not be found here. As a result, if we add a 7.2 node to a 7.0 cluster and allocate slots to the 7.2 node, the 7.2 node will crash when it hits this assert. This is somewhat like #12538. In this PR, we remove the assert and replace it with an unconditional removal.
oranagra
pushed a commit
that referenced
this pull request
Oct 18, 2023
…d 7.2 (#12604) In #10536, we introduced the assert. Some older server versions (like 7.0) don't gossip shard_id, so we will not add the node to cluster->shards, and node->shard_id is filled in randomly and may not be found here. As a result, if we add a 7.2 node to a 7.0 cluster and allocate slots to the 7.2 node, the 7.2 node will crash when it hits this assert. This is somewhat like #12538. In this PR, we remove the assert and replace it with an unconditional removal. (cherry picked from commit e5ef161)
Hello. While testing an upgrade of a sharded cluster from 7.0 to 7.2, we got a segmentation fault in updateShardId.
It seems that 7.0 nodes have no shard id in ping extensions.
Backtrace with gdb:
This should fix issue #12507
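
For illustration only, here is a minimal sketch of the kind of pointer check the title describes. It is not the actual patch: the `sketch_node` type, the `sketch_update_shard_id` function, and the `NAME_LEN` value of 40 are all hypothetical stand-ins, and the real updateShardId does more than this. The idea is simply that a peer which sends no shard id in its ping extensions (as 7.0 nodes seem to) leaves the caller with a NULL pointer, which must be checked before comparing or copying.

```c
#include <stddef.h>
#include <string.h>

/* Assumed identifier length for this sketch; not taken from the PR. */
#define NAME_LEN 40

/* Hypothetical, heavily simplified node record (not the real clusterNode). */
typedef struct {
    char shard_id[NAME_LEN];
} sketch_node;

/* Update the node's shard_id from a gossiped value.
 * Returns 1 if the id was changed, 0 if nothing was done. */
int sketch_update_shard_id(sketch_node *node, const char *shard_id) {
    /* The peer sent no shard_id: bail out instead of dereferencing NULL. */
    if (shard_id == NULL) return 0;

    /* Already up to date: nothing to do. */
    if (memcmp(node->shard_id, shard_id, NAME_LEN) == 0) return 0;

    memcpy(node->shard_id, shard_id, NAME_LEN);
    return 1;
}
```

With this guard, a call such as `sketch_update_shard_id(&node, NULL)` leaves the node untouched rather than crashing in `memcmp`.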