-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[server] Directed leadership transfer CLI and API #17383
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome @angrycub. I've left a few comments on how we might be able to refine this.
83094c5
to
8f00c63
Compare
8de8007
to
597e034
Compare
Thank you for this! Was looking for this command and excited to see this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking really good @angrycub! I think just a few more tweaks and we can land this.
I think this makes the validations a little easier to parse as a reader and eliminates nesting
The latest updates on your projects. Learn more about Vercel for Git ↗︎
1 Ignored Deployment
|
597e034
to
c5ded91
Compare
c5ded91
to
a557c7d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
<Tabs> | ||
<Tab heading="Nomad CLI"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leave this for now, but adding a brand new way to demonstrate API examples is the sort of thing that really doesn't belong in a feature PR. We end up with a bunch of one-off decisions that way that aren't applied equally across the API docs. In the future, if you want to do this sort of thing, make a separate PR for it so that we can discuss it independently of critical feature work.
for i := 0; i < retryCount; i++ { | ||
err := s.raft.LeadershipTransferToServer(to.ID, to.Address).Error() | ||
if err == nil { | ||
s.logger.Info("successfully transferred leadership") | ||
return nil | ||
} | ||
|
||
// "cannot transfer leadership to itself" | ||
// Handled at top of function, but reapplied here to prevent retrying if | ||
// it occurs while we are retrying | ||
if err.Error() == "cannot transfer leadership to itself" { | ||
s.logger.Debug("leadership transfer to current leader is a no-op") | ||
return nil | ||
} | ||
|
||
// ErrRaftShutdown: Don't retry if raft is shut down. | ||
if err == raft.ErrRaftShutdown { | ||
return err | ||
} | ||
|
||
// ErrUnsupportedProtocol: Don't retry if the Raft version doesn't | ||
// support leadership transfer since this will never succeed. | ||
if err == raft.ErrUnsupportedProtocol { | ||
return fmt.Errorf("leadership transfer not supported with Raft version lower than 3") | ||
} | ||
|
||
// ErrEnqueueTimeout: This seems to be the valid time to retry. | ||
s.logger.Error("failed to transfer leadership attempt, will retry", | ||
"attempt", i, | ||
"retry_limit", retryCount, | ||
"error", err, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is real nice. 👍
Co-authored-by: Tim Gross <[email protected]>
This PR adds an API endpoint and a CLI command to allow cluster operators to move leadership from the current leader to a specified server node. This can reduce leadership churn during rolling updates by allowing the operator to migrate leadership to the first upgraded node.
Commits
Fixes: #7376