Commit
Improve the failover of galera service
When a galera node is shutting down (e.g. during a rolling restart caused by a minor update), it can no longer serve SQL queries, yet it remains connected to clients. Those clients receive an unexpected SQL status [1], which prevents them from retrying their queries and causes unexpected errors down the road.

Improve the pod stop pre-hook to fail over the active endpoint to another pod before shutting down the galera server, and to kill connected clients so that they are forced to reconnect to the new active endpoint. At this stage, the galera server can be safely shut down, as no client will see its WSREP state update.

Also update the failover script:
1) when no endpoint is available, ensure no traffic is going through any pod.
2) do not trigger an endpoint failover as long as the current endpoint targets a galera node that is still part of the primary partition (i.e. it is still able to serve traffic).

[1] 'WSREP has not yet prepared node for application use'

Jira: OSPRH-11488
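The two failover rules and the pre-hook ordering described above can be sketched as pure decision logic. This is only an illustration, not the scripts changed by this commit: the function names `choose_endpoint` and `pre_stop_plan`, the step labels, the pod names, and the `in_primary` mapping (pod name to "is in the primary partition") are all hypothetical. Picking the first candidate in sorted order and always issuing the kill step are assumptions as well.

```python
from typing import Dict, List, Optional, Tuple


def choose_endpoint(current: Optional[str],
                    in_primary: Dict[str, bool]) -> Optional[str]:
    """Return the pod that should receive traffic, or None for no pod."""
    # Rule 2: keep the current endpoint while its node is still part of
    # the primary partition (it can still serve traffic).
    if current is not None and in_primary.get(current, False):
        return current
    # Otherwise fail over to a node that is in the primary partition.
    # Sorted order is an arbitrary but deterministic choice (assumption).
    candidates = sorted(pod for pod, ok in in_primary.items() if ok)
    if candidates:
        return candidates[0]
    # Rule 1: no endpoint available, so no traffic goes through any pod.
    return None


def pre_stop_plan(pod: str, current: Optional[str],
                  in_primary: Dict[str, bool]) -> List[Tuple[str, Optional[str]]]:
    """Ordered steps for the stop pre-hook of `pod` (hypothetical labels)."""
    steps: List[Tuple[str, Optional[str]]] = []
    if current == pod:
        # Move the active endpoint away before the server goes down.
        others = {p: ok for p, ok in in_primary.items() if p != pod}
        steps.append(("switch_endpoint", choose_endpoint(None, others)))
    # Force remaining clients to reconnect through the active endpoint.
    steps.append(("kill_client_connections", pod))
    # Only now is it safe to stop the galera server.
    steps.append(("shutdown_galera", pod))
    return steps


if __name__ == "__main__":
    state = {"galera-0": True, "galera-1": True, "galera-2": False}
    print(choose_endpoint("galera-0", state))
    print(pre_stop_plan("galera-0", "galera-0", state))
```

Keeping the decision separate from the side effects (updating the service, running SQL, stopping the server) makes the ordering easy to check in isolation. How primary-partition membership is actually probed is left out here.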
Showing 2 changed files with 102 additions and 23 deletions.