Skip to content

Converge divergent shard-id persisted in nodes.conf to master's shard…#3229

Merged
zuiderkwast merged 1 commit intovalkey-io:7.2from
roshkhatri:backport-divergent-shardid-7.2
Feb 19, 2026
Merged

Converge divergent shard-id persisted in nodes.conf to master's shard…#3229
zuiderkwast merged 1 commit intovalkey-io:7.2from
roshkhatri:backport-divergent-shardid-7.2

Conversation

@roshkhatri
Copy link
Member

Backporting #2174 to 7.2 branch

Handle divergent shard-id across master and replica from nodes.conf and reconcile all the nodes in the shard to the master node's shard-id.

Fixes #2171

(cherry picked from commit 4a3e713)

@roshkhatri roshkhatri force-pushed the backport-divergent-shardid-7.2 branch from 06c395c to f0a5d97 Compare February 18, 2026 22:05
… id (valkey-io#2174)

Handle divergent shard-id across master and replica from nodes.conf and
reconcile all the nodes in the shard to the master node's shard-id.

Fixes valkey-io#2171

(cherry picked from commit 4a3e713)

Signed-off-by: Harkrishn Patro <[email protected]>
Signed-off-by: Roshan Khatri <[email protected]>
Copy link
Contributor

@sarthakaggarwal97 sarthakaggarwal97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roshkhatri known 7.2 CI failures? also why not do it in #3139 itself?

if (clusterGetMaxEpoch() > server.cluster->currentEpoch) {
server.cluster->currentEpoch = clusterGetMaxEpoch();
}
if (dirty || shard_id_reconciled) clusterSaveConfigOrDie(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe I missed, why did we add this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roshkhatri Do you want to answer this before we merge?

In unstable, I can't see that we save cluster config after loading it even if we have reconciled the shard ids.

Copy link
Member Author

@roshkhatri roshkhatri Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we needed those to get the same behavior, the configs did not save to the files after the shardid we reconciled in-memory
The tests were failing. I will look into unstable why we dont need these there

Copy link
Member Author

@roshkhatri roshkhatri Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in clusterUpdateMyselfFlags in clusterInit,
on unstable:
(myself->flags != oldflags) is always true as we are adding new flags so the configs are saved automatically before sleep
on 7.2:
we dont add flags and (myself->flags != oldflags) is not true so we dont add clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG | CLUSTER_TODO_UPDATE_STATE);

Maybe we can add a save config before sleep in clusterinit by default on 7.2 and we wont need this

Copy link
Contributor

@zuiderkwast zuiderkwast Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That would affect other behavior too? Maybe best to do the minimum impact.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to minimum blast radius

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge this as-is then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah minimum is good

Copy link
Contributor

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zuiderkwast zuiderkwast merged commit a3762b9 into valkey-io:7.2 Feb 19, 2026
47 of 56 checks passed
@codecov
Copy link

codecov bot commented Feb 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (4207301) to head (4f8df34).
⚠️ Report is 1 commits behind head on 7.2.

Additional details and impacted files
@@     Coverage Diff     @@
##   7.2   #3229   +/-   ##
===========================
===========================
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@roshkhatri roshkhatri deleted the backport-divergent-shardid-7.2 branch February 19, 2026 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants