Skip to content

Conversation

@woaishixiaoxiao
Copy link

@woaishixiaoxiao woaishixiaoxiao commented Oct 29, 2021

When I use the LeaderLatch to select leader, there is a double-leader phenomenon.
The timeline is as follows:

  1. The zk cluster switch leader node bescause of zxid overflow. The cluster is unavailable to the outside world
  2. A client(not leader befor zxid overflow) and B client(is leader before zxid overflow) enter the suspend state, B client set its leader status to false
  3. The zk cluster complete the leader node election and the cluster back to normal
  4. A client enter the reconnect state and call the reset function, set its leader status to false.
  5. B client enter the reconnect state, call the reset function. set its leader status to false. Delete its old path.
  6. A client receive preNodeDeleteEvent. Then getChildren from zkServer. Find itself is the smallest number and set itself as a leader.
  7. B client create a new temporary node and then getChildren from zkServer. Find itself not the node with the smallest serial number and listen to the previous node delete event.
  8. A client delete its old path.
  9. B client receive the preNodeDeleteEvent. then getchildren from zkServer. Find itself is the smallest sequence number and then set itself as a leader
  10. A client create a new temporary node and then getChildren from zkServer. Find itself not the node with the smallest serial number and listen to the previous node delete event. but it doesn't set itself as a non-leader state. because of the sixth step operation, A still is leader state now.
  11. now A client and B client are the leader at the same time

@woaishixiaoxiao woaishixiaoxiao changed the title fix the bug of double master when use LeaderLatch to select the leader fix the bug of double leader when use LeaderLatch to select the leader Oct 29, 2021
@eolivelli
Copy link
Contributor

@woaishixiaoxiao thanks for sharing your fix,
do you think that we can add a test case to cover this change ?

@woaishixiaoxiao
Copy link
Author

woaishixiaoxiao commented Oct 29, 2021

@woaishixiaoxiao thanks for sharing your fix, do you think that we can add a test case to cover this change ?

OK. I will try it.
and I find another question related to the leader-selection scenari. When zkServer switch the leader and then returns to normal, all clients will execute state switching: connected->suspend->reconn
Because leaderlatch processing the reconn state will reset leader status, that is mean first set itself leader status false and then delete old temporary sequence Node and create a new one. This operation will cause the business side to perform a leader switch multiple. Some businesses don’t want to see such frequent switchovers happen such as mq. Also this operation will cause nodeDeleteEvent push once from zk server but client execute multiple times nodeDeleteCallback on same path because client saves mutiple watch local(create new path will getchild and listen. and prenodedeleteEvent also will getchild and listen ).
Why don't we replace StandardConnectionStateErrorPolicy with SessionConnectionStateErrorPolicy? The above phenomenon will be avoided

@woaishixiaoxiao woaishixiaoxiao force-pushed the fixbug/LeaderLatch-double-master branch 4 times, most recently from e292454 to 275bbea Compare November 2, 2021 15:06
@woaishixiaoxiao woaishixiaoxiao force-pushed the fixbug/LeaderLatch-double-master branch from 275bbea to 7272b85 Compare November 2, 2021 15:07
@woaishixiaoxiao
Copy link
Author

@woaishixiaoxiao thanks for sharing your fix, do you think that we can add a test case to cover this change ?

HI i have added a unit test. please approval thanks

@tisonkun
Copy link
Member

LGTM. I also think of this change days before. cc @eolivelli @Randgalt can you also give a review?

@tisonkun
Copy link
Member

@tisonkun tisonkun changed the title fix the bug of double leader when use LeaderLatch to select the leader CURATOR-653: fix potential double leader for LeaderLatch Sep 27, 2022
Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think after #430 merged, RECONNECT no more causes reset. But it's reasonable to set leadership to false if getChildren finds itself not the leader.

cc @eolivelli @Randgalt

@tisonkun
Copy link
Member

Well. After #430 merged the test added in this patch failed. Need a closer look.

Signed-off-by: tison <[email protected]>
@tisonkun
Copy link
Member

I adjust the test to inject force resets instead of depending on connection loss. Although this means it should be a non-real-world case now, I still agree on setLeadership(false) on checkLeadership find the latch isn't the leader. setLeadership(false) is idempotent.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@eolivelli
Copy link
Contributor

@tisonkun thanks for fixing the test
@woaishixiaoxiao do you agree with @tisonkun 's fix ?

Copy link
Contributor

@XComp XComp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good. I went over the test and it does what it should do. There are only two things that require an actual change (typo in the variable and catch clause in the test implementation hiding any Exception).

About the other nit-picky comments: I'm aware that I'm nit picking here. Hence, see them as proposal rather than requests to change. :-) Nice proposal 👍

Additionally, besides reasoning the code I verified that the test fails with the fix being reverted and succeeds with the fix being included.

volatile CountDownLatch debugResetWaitLatch = null;

@VisibleForTesting
volatile CountDownLatch debugRestWaitBeforeNodeDelete = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
volatile CountDownLatch debugRestWaitBeforeNodeDelete = null;
volatile CountDownLatch debugResetWaitBeforeNodeDeleteLatch = null;

There's a typo in the name. Additionally, we might want to add Latch at the end to reflect the purpose of this member analogously to the other latches.

assertEquals(states.poll(timing.forWaiting().milliseconds(), TimeUnit.MILLISECONDS), "true");
}
}
catch (Exception e){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we hiding thrown exceptions here? Shouldn't we expose it as part of the test run if something went wrong? 🤔 The test would succeed if the Exception is thrown in this block and caught here.

latches.add(latch);
assertEquals(states.poll(timing.forWaiting().milliseconds(), TimeUnit.MILLISECONDS), ConnectionState.CONNECTED.name());
if (i == 0) {
assertEquals(states.poll(timing.forWaiting().milliseconds(), TimeUnit.MILLISECONDS), "true");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertEquals(states.poll(timing.forWaiting().milliseconds(), TimeUnit.MILLISECONDS), "true");
assertEquals("The first LeaderLatch instance should acquire leadership.", states.poll(timing.forWaiting().milliseconds(), TimeUnit.MILLISECONDS), "true");

nit: maybe adding a bit more context to this polling here to describe the test case

}
}
timing.forWaiting().sleepABit();
// now latch1 is leader, latch2 is not leader. latch2 listens to the ephemeral node created by latch1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Moving comments into assert messages improves the test output and still works as some kind of comment. This comment could be added to the assertTrue and assertFalse in line 284-285 below describing the currently expected state.


// force latch1 and latch2 reset
latch1.reset();
ForkJoinPool.commonPool().submit(() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a comment here on why we're calling latch2.reset() in a separate thread? AFAIU, it's done to not make the test's thread block due to latch2.debugRestWaitBeforeNodeDelete. It might help readers if this is reflected in a comment here. WDYT?

public void stateChanged(CuratorFramework client, ConnectionState newState)
{
if (newState == ConnectionState.CONNECTED) {
states.add(newState.name());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Adding the LeaderLatch ID (we could use for loop i) here might help further down in the test understanding the state of the queue.

Comment on lines +258 to +268
LeaderLatchListener listener = new LeaderLatchListener() {
@Override
public void isLeader() {
states.add("true");
}

@Override
public void notLeader() {
states.add("false");
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the LeaderLatch ID in the event labels (as mentioned above) might help when evaluating the queue later on in the test. But to be fair: doing the asserts on hasLeadership like it's already done below (lines 303-304) serves the same purpose. I just mention it as another idea here. 🤷

Comment on lines +283 to +284
LeaderLatch latch1 = latches.get(0);
LeaderLatch latch2 = latches.get(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LeaderLatch latch1 = latches.get(0);
LeaderLatch latch2 = latches.get(1);
LeaderLatch initialLeaderLatch = latches.get(0);
LeaderLatch initialNonLeaderLatch = latches.get(1);

nit: maybe, making the variable more descriptive to avoid confusion. Especially because we're switching the order here in comparison to what is described in the PR description and the corresponding ticket.

return;
}
}
timing.forWaiting().sleepABit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of waiting here? 🤔

@tisonkun
Copy link
Member

@XComp Thank you very much for the comments. Could you send a pull request based on the current patch? I don't know whether I can directly merge on the forked repository but at least I can submit the patch when you made it :)

@XComp
Copy link
Contributor

XComp commented Oct 10, 2022

@XComp Thank you very much for the comments. Could you send a pull request based on the current patch? I don't know whether I can directly merge on the forked repository but at least I can submit the patch when you made it :)

I created PR #436 but couldn't base it onto this PR. I handled my review comments individually to make it easier to select relevant changes. The last commit is a bigger refactoring of the test. You might want to leave that out if you think that it's too much of a change.

@tisonkun
Copy link
Member

tisonkun commented Oct 10, 2022

but couldn't base it onto this PR

Actually, you can send a pull request with base branch = https://github.com/woaishixiaoxiao/curator/tree/fixbug/LeaderLatch-double-master and head branch = your branch :)

Then the pull request will be opened against https://github.com/woaishixiaoxiao/curator and that should be fine. We use this pull request. Or it's also OK since the author is inactive, you take over the whole lifecycle.

@XComp
Copy link
Contributor

XComp commented Oct 10, 2022

It was a bit trickier but I managed to base the PR based on his branch (his branch didn't show up when I was doing it the way I usually create PRs ¯_(ツ)_/¯). woaishixiaoxiao#1. Although, I'm not sure whether that's of any help because the original author would have to merge it into his branch, wouldn't he?

@tisonkun
Copy link
Member

@XComp OK. After looking into the patch I think #436 is better to proceed (you made a significant diff :)). I'll try to take a look this wek.

@XComp
Copy link
Contributor

XComp commented Oct 10, 2022

Yes, but the biggest part of the diff is the last commit (adaee91): I'm ok with reverting that one if you think it's too much. The changes related to the comments of my review of this PR are included in the commits excluding adaee91 in #436 .

tisonkun added a commit that referenced this pull request Oct 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants