Commit 5da77dc

jzhou77 authored and sbodagala committed
Merge pull request #9814 from sbodagala/main
FdbServer not able to join cluster
1 parent 4808747 commit 5da77dc

1 file changed, +7 -1 lines changed

fdbserver/LeaderElection.actor.cpp

Lines changed: 7 additions & 1 deletion
@@ -158,7 +158,13 @@ ACTOR Future<Void> tryBecomeLeaderInternal(ServerCoordinators coordinators,
 // These coordinators are forwarded to another set. But before we change our own cluster file, we need
 // to make sure that a majority of coordinators know that. SOMEDAY: Wait briefly to see if other
 // coordinators will tell us they already know, to save communication?
-wait(changeLeaderCoordinators(coordinators, leader.get().first.serializedInfo));
+// NOTE: If a majority of coordinators (in the current connection string) have failed then we can
+// end up waiting here indefinitely. Try to make progress in that scenario by proceeding with the
+// connection string that we have received. Not a great solution, but can help in certain scenarios.
+choose {
+	when(wait(changeLeaderCoordinators(coordinators, leader.get().first.serializedInfo))) {}
+	when(wait(delay(20))) {}
+}
 
 if (!hasConnected) {
 	TraceEvent(SevWarnAlways, "IncorrectClusterFileContentsAtConnection")
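
The change above replaces an unbounded wait with a race between the coordinator update and a 20-second delay: whichever finishes first lets the actor continue, and both `when` bodies are empty, so execution proceeds the same way in either case. For readers unfamiliar with Flow's `choose`/`when`, the following is a rough analogue in standard C++ using `std::future::wait_for`. It is only a sketch, not FoundationDB code: `waitOrTimeout` is a hypothetical helper introduced here for illustration, and unlike the Flow version it does nothing to cancel the pending operation once the timeout wins.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical helper (not part of FoundationDB): wait for `done` to become
// ready, but give up once `timeout` has elapsed.
// Returns true if `done` was ready in time, false if the timeout won.
// The caller continues in both cases, like the two empty when-bodies in the diff.
bool waitOrTimeout(std::future<void>& done, std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0); // a negative timeout would be a caller bug
    return done.wait_for(timeout) == std::future_status::ready;
}
```

A caller in this style would invoke something like `waitOrTimeout(ackFromMajority, std::chrono::seconds(20))` and then continue regardless of the result, where `ackFromMajority` is an assumed future standing in for the coordinator update. The return value is useful mainly for logging which branch won.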
