Skip to content

Commit

Permalink
Take replication lag into account while selecting primary candidate (#…
Browse files Browse the repository at this point in the history
…14634)

Signed-off-by: Manan Gupta <[email protected]>
  • Loading branch information
GuptaManan100 authored Dec 5, 2023
1 parent 6820742 commit 36d8d36
Show file tree
Hide file tree
Showing 16 changed files with 1,231 additions and 960 deletions.
12 changes: 12 additions & 0 deletions changelog/19.0/19.0.0/summary.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,8 @@
- [`FOREIGN_KEY_CHECKS` is now a Vitess Aware Variable](#fk-checks-vitess-aware)
- **[Query Compatibility](#query-compatibility)**
- [`SHOW VSCHEMA KEYSPACES` Query](#show-vschema-keyspaces)
- **[Planned Reparent Shard](#planned-reparent-shard)**
- [`--tolerable-replication-lag` Sub-flag](#tolerable-repl-lag)

## <a id="major-changes"/>Major Changes

Expand Down Expand Up @@ -66,3 +68,13 @@ mysql> show vschema keyspaces;
+----------+---------+-------------+---------+
2 rows in set (0.01 sec)
```

### <a id="planned-reparent-shard"/>Planned Reparent Shard

#### <a id="tolerable-repl-lag"/>`--tolerable-replication-lag` Sub-flag

A new sub-flag `--tolerable-replication-lag` has been added to the command `PlannedReparentShard` that allows users to specify the amount of replication lag that is considered acceptable for a tablet to be eligible for promotion when Vitess makes the choice of a new primary.
This feature is opt-in and not specifying this sub-flag makes Vitess ignore the replication lag entirely.

A new flag in VTOrc with the same name has been added to control the behaviour of the PlannedReparentShard calls that VTOrc issues.

19 changes: 11 additions & 8 deletions go/cmd/vtctldclient/command/reparents.go
Original file line number Diff line number Diff line change
Expand Up @@ -183,9 +183,10 @@ func commandInitShardPrimary(cmd *cobra.Command, args []string) error {
}

var plannedReparentShardOptions = struct {
NewPrimaryAliasStr string
AvoidPrimaryAliasStr string
WaitReplicasTimeout time.Duration
NewPrimaryAliasStr string
AvoidPrimaryAliasStr string
WaitReplicasTimeout time.Duration
TolerableReplicationLag time.Duration
}{}

func commandPlannedReparentShard(cmd *cobra.Command, args []string) error {
Expand Down Expand Up @@ -216,11 +217,12 @@ func commandPlannedReparentShard(cmd *cobra.Command, args []string) error {
cli.FinishedParsing(cmd)

resp, err := client.PlannedReparentShard(commandCtx, &vtctldatapb.PlannedReparentShardRequest{
Keyspace: keyspace,
Shard: shard,
NewPrimary: newPrimaryAlias,
AvoidPrimary: avoidPrimaryAlias,
WaitReplicasTimeout: protoutil.DurationToProto(plannedReparentShardOptions.WaitReplicasTimeout),
Keyspace: keyspace,
Shard: shard,
NewPrimary: newPrimaryAlias,
AvoidPrimary: avoidPrimaryAlias,
WaitReplicasTimeout: protoutil.DurationToProto(plannedReparentShardOptions.WaitReplicasTimeout),
TolerableReplicationLag: protoutil.DurationToProto(plannedReparentShardOptions.TolerableReplicationLag),
})
if err != nil {
return err
Expand Down Expand Up @@ -292,6 +294,7 @@ func init() {
Root.AddCommand(InitShardPrimary)

PlannedReparentShard.Flags().DurationVar(&plannedReparentShardOptions.WaitReplicasTimeout, "wait-replicas-timeout", topo.RemoteOperationTimeout, "Time to wait for replicas to catch up on replication both before and after reparenting.")
PlannedReparentShard.Flags().DurationVar(&plannedReparentShardOptions.TolerableReplicationLag, "tolerable-replication-lag", 0, "Amount of replication lag that is considered acceptable for a tablet to be eligible for promotion when Vitess makes the choice of a new primary.")
PlannedReparentShard.Flags().StringVar(&plannedReparentShardOptions.NewPrimaryAliasStr, "new-primary", "", "Alias of a tablet that should be the new primary.")
PlannedReparentShard.Flags().StringVar(&plannedReparentShardOptions.AvoidPrimaryAliasStr, "avoid-primary", "", "Alias of a tablet that should not be the primary; i.e. \"reparent to any other tablet if this one is the primary\".")
Root.AddCommand(PlannedReparentShard)
Expand Down
1 change: 1 addition & 0 deletions go/flags/endtoend/vtorc.txt
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,7 @@ Flags:
--tablet_manager_grpc_key string the key to use to connect
--tablet_manager_grpc_server_name string the server name to use to validate server certificate
--tablet_manager_protocol string Protocol to use to make tabletmanager RPCs to vttablets. (default "grpc")
--tolerable-replication-lag duration Amount of replication lag that is considered acceptable for a tablet to be eligible for promotion when Vitess makes the choice of a new primary in PRS
--topo-information-refresh-duration duration Timer duration on which VTOrc refreshes the keyspace and vttablet records from the topology server (default 15s)
--topo_consul_lock_delay duration LockDelay for consul session. (default 15s)
--topo_consul_lock_session_checks string List of checks for consul session. (default "serfHealth")
Expand Down
Loading

0 comments on commit 36d8d36

Please sign in to comment.