*: introduce judge_split_prevote #420

Open
wants to merge 14 commits into
base: master

Member

@BusyJay BusyJay commented Mar 6, 2021

During a split prevote, a campaign will fail because every node thinks
it will collect enough votes; once they actually start campaigning,
no one votes for the others, so the campaign has to fail.

judge_split_prevote solves the problem by adding an extra constraint
to split prevote: only vote for nodes that have greater IDs. It's easy
to conclude that it works for peer numbers less than 5. For >=5 nodes,
it's still possible to split again, but it should be enough for most
cases. Because the constraint is only applied during a split prevote, even
a failure won't lead to a worse result.
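
The rule above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Role`, `grants_prevote` and `tally` are hypothetical names, log and term checks are assumed to have passed already, and the baseline assumes a pre-candidate grants every pre-vote request because pre-votes are not recorded.

```rust
/// Role of a node, reduced to what this sketch needs (hypothetical type).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Role {
    Follower,
    PreCandidate,
}

/// Hypothetical helper: does `voter_id` grant a pre-vote to `candidate_id`?
/// Assumes the usual log and term checks have already passed.
fn grants_prevote(
    judge_split_prevote: bool,
    voter_role: Role,
    voter_id: u64,
    candidate_id: u64,
) -> bool {
    // Without the option, or when the voter is not pre-campaigning itself,
    // the extra constraint does not apply.
    if !judge_split_prevote || voter_role != Role::PreCandidate {
        return true;
    }
    // Two competing pre-candidates: only yield to the greater ID.
    candidate_id > voter_id
}

/// Pre-votes collected by each pre-candidate when all of them compete at once.
/// Every pre-candidate counts its own vote.
fn tally(judge_split_prevote: bool, pre_candidates: &[u64]) -> Vec<(u64, usize)> {
    pre_candidates
        .iter()
        .map(|&candidate| {
            let granted = pre_candidates
                .iter()
                .filter(|&&voter| {
                    voter != candidate
                        && grants_prevote(judge_split_prevote, Role::PreCandidate, voter, candidate)
                })
                .count();
            (candidate, 1 + granted)
        })
        .collect()
}
```

With 3 voters (quorum 2) and the leader down, `tally(false, &[1, 2])` gives both pre-candidates 2 pre-votes, so both would go on to campaign and split the real vote. `tally(true, &[1, 2])` leaves node 1 with only its own vote, so only node 2 proceeds.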

During a split prevote, a campaign will fail because every node thinks
it will collect enough votes; once they actually start campaigning,
no one votes for the others, so the campaign has to fail.

`judge_split_prevote` solves the problem by adding an extra constraint
to split prevote: only vote for nodes that have greater IDs. It's easy
to conclude that it works for peer numbers not greater than 5. For 7 nodes,
it's still possible to split again, but it should be enough for most
cases. Because the constraint is only applied during a split prevote, even
a failure won't lead to a worse result.

Signed-off-by: Jay Lee <[email protected]>
@BusyJay BusyJay force-pushed the introduce-judge-split-vote branch from cf94594 to 158f7f1 Compare March 7, 2021 14:27
@BusyJay
Member Author

BusyJay commented Mar 7, 2021

I have tested it with 10k regions at configuration sizes 3 and 5; in both cases, elections finish within two election timeouts when one TiKV is down.

Signed-off-by: Jay Lee <[email protected]>
@BusyJay BusyJay marked this pull request as ready for review March 7, 2021 15:29
Comment on lines +1474 to +1476
// judge split vote can break symmetry of campaign, but as
// it only happens during split vote, the impact should not
// be significant.
Member

How can we tell whether a campaign will end up in a split vote?

Member

This could make it impossible for raft nodes with lower IDs to become the leader, even when we want to transfer leadership to them.

Member Author

If the configuration size is an odd number and the leader is down, a split vote can happen if two nodes are in the PreCandidate state. judge_split_prevote only applies to prevote, and transferring leadership skips prevote, so the two don't affect each other.

Member

> judge_split_prevote only works on prevote

Apart from leader transfer, a node needs to pass the pre-campaign before starting the actual campaign, so judge_split_prevote will impact the whole election process (assuming pre-vote is enabled, since that is when judge_split_prevote takes effect).

Member Author

> so judge_split_prevote will impact the whole election process...

Indeed, it depends on whether nodes are working slowly. But in my tests, with one node down and after elections had finished, the leader counts on the nodes did not differ much (I removed the balance-leader scheduler before shutting down a node). And even if it leads to more leaders on some nodes, that should not be a problem, since PD will help reach an eventual balance.

@NingLin-P
Member

To avoid split prevote, maybe we can record the prevote like the real vote (without needing to persist it) and reject incoming vote requests in the same term, as a real campaign does.

@BusyJay
Member Author

BusyJay commented Mar 8, 2021

> ...record the prevote like the real vote...

Recording votes won't solve split vote. If pre-campaign works like actual campaign and split vote can happen in actual campaign, then it can also happen in pre-campaign.

@NingLin-P
Member

> > ...record the prevote like the real vote...
>
> Recording votes won't solve split vote. If pre-campaign works like actual campaign and split vote can happen in actual campaign, then it can also happen in pre-campaign.

We can't prevent split vote completely, but with random election timeout and vote recording, split vote should happen rarely.

@BusyJay
Member Author

BusyJay commented Mar 8, 2021

> ...split vote should happen rarely.

How can recording vote reduce the probability of split vote? It just makes the split happen at an earlier stage, moving it from the actual vote to the prevote. The strategy here solves the split completely for configuration size 3 and makes it rarely happen for configuration size 5.

@NingLin-P
Member

NingLin-P commented Mar 8, 2021

> How can recording vote reduce the probability of split vote?

After recording a prevote for a future term, a node should not start a pre-campaign or prevote for other nodes (at the same future term), as it has already prevoted at that future term.

@BusyJay
Member Author

BusyJay commented Mar 8, 2021

A common split-vote situation with 3 voters is that the leader is down and two followers start campaigning at the same time. How does the strategy you describe make the campaign succeed in one round of election?
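
To make the scenario concrete, here is a minimal sketch of one such round, assuming every live node is a candidate and each node records at most one vote per term, first come, first served. `quorum` and `simultaneous_round` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Majority needed in a configuration of `voters` nodes.
fn quorum(voters: usize) -> usize {
    voters / 2 + 1
}

/// One election round in which all live nodes campaign at the same time.
/// Returns the number of votes each candidate ends up with.
fn simultaneous_round(candidates: &[u64]) -> HashMap<u64, usize> {
    // The single vote each node has recorded for this term.
    let mut voted_for: HashMap<u64, u64> = HashMap::new();

    // Every candidate votes for itself before any request arrives.
    for &candidate in candidates {
        voted_for.insert(candidate, candidate);
    }

    // Requests are delivered afterwards; a node that has already voted
    // keeps its recorded vote, so these requests change nothing here.
    for &candidate in candidates {
        for &peer in candidates {
            if peer != candidate {
                voted_for.entry(peer).or_insert(candidate);
            }
        }
    }

    let mut tally: HashMap<u64, usize> = HashMap::new();
    for &candidate in voted_for.values() {
        *tally.entry(candidate).or_insert(0) += 1;
    }
    tally
}
```

For `simultaneous_round(&[1, 2])` in a 3-voter configuration, each candidate holds only its own vote, which is below `quorum(3)`, so recording votes alone does not let either side win that round.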

@NingLin-P
Member

> A common split-vote situation with 3 voters is that the leader is down and two followers start campaigning at the same time. How does the strategy you describe make the campaign succeed in one round of election?

It can't. Forget about it; I misunderstood the problem earlier.

// it only happens during split vote, the impact should not
// be significant.
!self.judge_split_prevote
|| self.state != StateRole::PreCandidate
Member

Add some comments for transfer leader?

src/raft.rs
Comment on lines +1317 to +1321
// When judge_split_prevote is enabled, reject explicitly so the candidate exits PreCandidate early
// and can vote for another peer later.
if self.judge_split_prevote
&& m.get_msg_type() == MessageType::MsgRequestPreVote
{
Member

It seems Follower and PreCandidate are no different here: both of them can (pre)vote for other peers.

Member Author

A follower ignores every PreVoteResponse. For example, A and B split pre-votes, and C can vote for both A and B. If B is chosen and A doesn't step down to follower, A will still start a campaign when C's prevote is received.
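
A rough sketch of that A/B/C example, under simplifying assumptions: `Node`, `Role` and `on_prevote_response` are hypothetical names, and the `step_down_on_reject` flag stands in for the explicit-reject behaviour discussed here rather than reproducing the actual state machine.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Role {
    Follower,
    PreCandidate,
    Candidate,
}

/// Hypothetical node that only tracks what the example needs.
struct Node {
    role: Role,
    granted: usize,
    quorum: usize,
}

impl Node {
    /// A pre-candidate starts with its own pre-vote.
    fn new_pre_candidate(quorum: usize) -> Self {
        Node { role: Role::PreCandidate, granted: 1, quorum }
    }

    /// Handles one pre-vote response.
    fn on_prevote_response(&mut self, reject: bool, step_down_on_reject: bool) {
        // Followers (and anything else) ignore pre-vote responses.
        if self.role != Role::PreCandidate {
            return;
        }
        if reject {
            // Assumed behaviour: an explicit reject ends the pre-campaign early.
            if step_down_on_reject {
                self.role = Role::Follower;
            }
            return;
        }
        self.granted += 1;
        if self.granted >= self.quorum {
            // Enough pre-votes: start the real campaign.
            self.role = Role::Candidate;
        }
    }
}
```

If A is rejected by B and then receives C's grant, A ends as `Follower` when it steps down on the reject, and as `Candidate` when it does not, which is the extra campaign the explicit reject is meant to avoid.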

@BusyJay
Member Author

BusyJay commented Jun 25, 2021

PTAL

4 participants