Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: routing rules to mirror traffic during MoveTables #13772

Closed
4 of 7 tasks
maxenglander opened this issue Aug 13, 2023 · 6 comments · Fixed by #16879
Closed
4 of 7 tasks

RFC: routing rules to mirror traffic during MoveTables #13772

maxenglander opened this issue Aug 13, 2023 · 6 comments · Fixed by #16879

Comments

@maxenglander
Copy link
Collaborator

maxenglander commented Aug 13, 2023

Motivation

MoveTables facilitates moving workloads from non-Vitess MySQL into Vitess. Typical issues users may encounter during SwitchTraffic:

  • Queries fail due to unsupported MySQL syntax in Vitess or due to planner/executor bugs.
  • Performance degrades, due to inadequate physical resources or a cold buffer pool.
  • Unplanned changes in schema semantics, e.g. when a unique key constraint no longer enforces global uniqueness when migrating from an unsharded to a sharded keyspace.

Users should mitigate these challenges through best practices such as:

  • Running queries through vtexplain to discover syntax and V/Schema design issues.
  • Test production-like workloads against a production-grade Vitess cluster.
  • Using SwitchTraffic --tablet_types RDONLY,REPLICA to validate SELECT queries, and gradually ramping queries to @replica or @rdonly.

However, these best practices come with their own challenges:

  • Users may choose to capture production queries with a few hours of general log activity off-peak, which may mean the log does not capture all relevant kinds of queries.
  • vtexplain does a good job of reporting whether a query can be parsed and planned by Vitess, but it does not perfectly replicate the VTGate planning phase, and does not emulate the query execution phase at all.
  • Spinning up a production-grade test cluster can be expensive, especially when there are many shards involved. Emulating the behavior of a busy application against the test cluster can be expensive and difficult as well.
  • Configuring an application to use @replica may be challenging, depending on the nature of the workload or limitations in application libraries.

Proposal

To complement these best practices, this RFC proposes a new type of routing rule to mirror some or all traffic to the target keyspace in the course of a MoveTables migration. This feature will allow users to easily test how a new, production-grade Vitess cluster will handle a production workload prior to running SwitchTraffic.

Rule Definitions

Define a new topology record MirrorRules to support mirror rules that look like this:

{
  "rules": [
    {
      "fromKeyspace": "ks1@rdonly",
      "toKeyspace": "ks2",
      "percent": 50.0
    }
  ]
}

VSchema

vindexes.BuildKeyspace will be updated to take new proto definitions into account. vindexes.(*VSchema).FindMirrorTables will provide other packages visibility into mirror rules.

Validate Mirror Rules

Validate constraints on mirror rules usage, e.g.:

+       if sourceMirror.Percent <= 0 || sourceMirror.Percent > 100 {
+               newRule.Error = vterrors.Errorf(
+                       vtrpcpb.Code_INVALID_ARGUMENT,
+                       "mirror rule mirror percentage must be between 0 (exclusive) and 100",
+               )
+               return newRule
+       }

Find Mirror Tables

Other parts of the code base will need to be able to introspect tables which have mirror rules on them.

+// FindMirroredTables finds tables that mirror an authoritative table.
+func (vschema *VSchema) FindMirroredTables(
+        keyspace, 
+        tablename string,
+        tabletType topodatapb.TabletType,
+) (map[*Table]*Mirror, error) {

A table will be considered "mirrored" if there is a mirror rule defined for keyspace, and the target keyspace has a matching table name.

Query Planning

Modify planbuilder to produce engine.Mirror primitives when a SELECT query is planned, and the tables that are touched are all in keyspaces with mirror rules to other keyspaces.

func createInstructionFor(ctx context.Context, query string, stmt sqlparser.Statement, reservedVars *sqlparser.ReservedVars, vschema plancontext.VSchema, enableOnlineDDL, enableDirectDDL bool) (*planResult, error) {
	switch stmt := stmt.(type) {
-	case *sqlparser.Select, *sqlparser.Insert, *sqlparser.Update, *sqlparser.Delete:
+	case *sqlparser.Select:
+		return buildMirroredRoutePlan(ctx, query, stmt, reservedVars, vschema, enableOnlineDDL, enableDirectDDL)
+	case *sqlparser.Insert, *sqlparser.Update, *sqlparser.Delete:
		configuredPlanner, err := getConfiguredPlanner(vschema, stmt, query)

Query Execution

Query planning will produce a Mirror primitive that executes the main and mirrored queries in parallel, returning only the results and the error from the main query.

Some additional goals, not demonstrated in the code sample below:

  • Use a hard-coded or user-defined timeout to limit additional latency that mirrored queries may add to the overall execution.
  • Use a hard-coded or user-defined limit on concurrent mirror queries. Simply throw away mirror queries when this limit is reached. Will effectively work the same way as a pool.
  • Errors from main queries can be compared with errors from mirrored queries. Differences can be logged and instrumented.
  • Minimize memory overhead by either setting the gRPC max receive size to a small value for mirrored queries, or instructing the remote VTTablet to return an empty result for mirrored queries.
func (m *Mirror) TryExecute(
       ctx context.Context,
       vcursor VCursor,
       bindVars map[string]*querypb.BindVariable,
       wantfields bool,
) (*sqltypes.Result, error) {
       var wg sync.WaitGroup
       defer wg.Wait()

       for _, target := range m.Targets {
               if !target.Accept() {
                       continue
               }

               wg.Add(1)
               go func(target Primitive, vcursor VCursor) {
                       defer wg.Done()
                       _, _ = target.TryExecute(ctx, vcursor, bindVars, wantfields)
               }(target, vcursor.CloneForMirroring(ctx))
       }

       return m.Primitive.TryExecute(ctx, vcursor, bindVars, wantfields)
}

func (m *PercentMirrorTarget) Accept() bool {
       return m.Percent > (rand.Float32() * 100.0)
}

MoveTables

Introduce a new MoveTables command: MirrorTraffic.

  • MirrorTraffic --percent=1 --tablet_types=RDONLY,REPLICA will add mirror rules reflecting 1% of traffic from the source keyspace to the target keyspace.
  • Unlike SwitchTraffic, there will be no reverse option.
  • In the initial implementation, there will be no option to filter rules to specific --tables; all tables in the workflow will be mirrored. Users are free to manually add/remove/modify rules to achieve different results.
  • --remove will remove all mirror rules.
func commandMirrorTraffic(cmd *cobra.Command, args []string) error {
       format, err := GetOutputFormat(cmd)
       if err != nil {
               return err
       }

       cli.FinishedParsing(cmd)

       req := &vtctldatapb.WorkflowMirrorTrafficRequest{
               Keyspace:    BaseOptions.TargetKeyspace,
               Workflow:    BaseOptions.Workflow,
               TabletTypes: MirrorTrafficOptions.TabletTypes,
               Percent:     MirrorTrafficOptions.Percent,
       }

Monitoring

Existing metrics

Mirrored queries will appear in VTTablet-level metrics of the keyspace(s) where they are sent. Users will be able to compare the query rates, error rates, and latencies of the primary and mirrored keyspace(s).

I don't see a compelling reason to add additional metrics, labels, or label values at the VTTablet to indicate which queries were mirrored.

Differences between main and mirrored queries

I think the most useful thing to will be for VTGate to instrument differences between main queries and mirrored queries. For example if mirrored queries take 20% longer than their main counterpart. Similarly, if the main query does not return an error, but the mirrored queries do.

In terms of how this could be reported, it wouldn't be feasible to instrument this in metrics on a per-query-pattern basis. We can log differences, but that could potentially result in a lot of logging for a very busy application.

What I suggest doing is a cache with a user-definable max size, which maps query patterns to stats that can be fetched on demand through :15000 or logged periodically (e.g. every T minutes or every N queries) to stderr or a file.

Also

Other things that could be useful to monitor and/or log:

  • Log (maybe just debug) when mirror operators are discarded from the tree because they exceeded the safety limit.
  • Count of mirror query executions.
  • Count of mirror primitives timed out or discarded because the global mirror concurrency limit was exceeded.

DML

I think there is use case for mirroring DML. Some applications are write-heavy, and for those applications it would be useful to test the throughput and latency of queries through Vitess. Additionally, mirroring DML may surface subtle issues in V/Schema design, e.g. where a main query to an unsharded source keyspace returns a duplicate key error, but the mirrored query to the target keyspace do not because a unique key is not enforced globally.

However, in the context of MoveTables, it does not make sense to commit DML to the target keyspace, since it would break VReplication (duplicate keys) or corrupt data. One way DML could be mirrored without wreaking havoc would be to wrap every DML in a transaction, and to ensure that every transaction is rolled back and never committed. Even then, there would be the strong likelihood of lock contention which would affect the performance of the mirrored DML.

Given the risk and uncertainty around mirrored DML, I suggest that an initial implementation only mirror Route (= SELECT) operators. A bug in mirrored DML could cause a lot of damage, so I think it is wise to defer any attempt at this until an initial, SELECT-only implementation has stabilized through production trials.

Performance

Performance considerations have been mentioned in various sections above, and are repeated here:

  • Mirroring will not be used and will not impact performance when no mirror rules are defined.
  • Enforce at most one mirror rule per table. Allow each table to mirror to at most one table.
  • To avoid a mirroring explosion, enforce a hard-coded, global limit on the number of mirror operators that can be present in the final operator tree. Set this to 1 in the initial version of this feature, and relax if needed going forward.
  • Enforce a global, hard-coded or user-defined timeout on mirrored queries, possibly expressed as "time to continue executing after the main query has finished".
  • Enforce a global, hard-coded or user-defined concurrency limit on mirrored queries.
  • Limit impact on memory usage by either setting gRPC max receive size to a small value for mirrored RPC calls, or by instructing VTTablet's to send an empty result for mirrored queries.

Alternatives

Some alternatives to parts of the proposal above.

Accommodate mirror rules within routing rules

Originally, the RFC proposed that we accommodate mirror rules within routing rules. After discussion, we agreed not to go with this approach, because it restricts the shape that mirror rule definitions can take, and risks breaking or complicating routing rule processing. Here is the content of the original proposal:

 // MirrorRule specifies a routing rule.
 message MirrorRule {
+  enum Action {
+    ACTION_UNSPECIFIED = 0;
+    ACTION_REDIRECT = 1;
+    ACTION_MIRROR = 2;
+  }
+
+  message Mirror {
+    float percent = 1;
+  }
+
   string from_keyspace = 1;
   string to_keyspace = 2;
   float percent = 3;
 }

Pros:

  • Can re-use existing routing rules code paths throughout the code base for handling mirror rules.
  • Nice to combine two things (routing rules / mirror rules) that are conceptually very similar.
  • In cases where it makes sense to update routing rules and mirror rules at the same time, can accomplish with one topology call.

Cons:

  • Changing the structure of existing routing rules risks of subtly breaking things that depend on routing rules.
  • For example: current usage makes assumptions about the content of those rules, e.g. that there is only one rule for each fromTable, an assumption changed by introducing mirror rules.
  • Imposes limitations on the shape of the mirror rule definitions.
  • Increased risk of hitting the maximum topo request/item size (1.5 MiB for etcd).

Decouple main and mirrored queries

In the proposal above, main and mirror queries execute in parallel, but complete as a group:

func (m *Mirror) TryExecute(
       ctx context.Context, 
       vcursor VCursor,
       bindVars map[string]*querypb.BindVariable,
       wantfields bool,
) (*sqltypes.Result, error) {
       var wg sync.WaitGroup
       defer wg.Wait()

       for _, target := range m.Targets {
               if !target.Accept() {
                       continue
               }

               wg.Add(1)
               go func(target Primitive, vcursor VCursor) {
                       defer wg.Done()
                       _, _ = target.TryExecute(ctx, vcursor, bindVars, wantfields)
               }(target, vcursor.CloneForMirroring(ctx))
       }

       return m.Primitive.TryExecute(ctx, vcursor, bindVars, wantfields)
}

An alternate approach would be to decouple the main and mirrored queries, so that a reply can be sent back to the caller as soon as the main query finishes. Mirrored queries can continue executing afterwards, up to a timeout or global concurrency limit. A similar approach was used to implement replica warming queries.

Pros:

  • Greater independence of main queries from mirrored queries.
  • A slow mirrored query will not delay a main query from sending a reply to the caller.
  • A slow main query will not delay mirrored queries from releasing slots in the mirrored query pool.

Cons:

  • More code complexity.
  • Will be more difficult to compare results and errors from main queries with results and errors from mirrored queries.

Implement mirroring at the operator level

The proposal initially recommended implementing mirroring at the operator level. After receiving feedback from the Vitess team, the proposal was amended to mirror at the plan-level. Here is the content of the original proposal:

Incorporate mirror operators into the operator tree. Prune and consolidate these operators in a way that balances the intent of mirroring with other concerns like performance.

  • In the initial implementation, only Route (= SELECT) operators will be mirrored. DML and other types of operators will not be mirrored. See the DML section below for more thoughts on this.
  • Add Mirror operators to the tree in a way that reflects mirror rule definitions.
  • Compact/join Mirror operators into as few combined operators as possible, by following existing logic in the codebase for merging routes, for example.
  • Remove Mirror mirror targets when their keyspace is the same as the keyspace of the main Route.
  • Prune Mirror operators without any mirror targets.
  • Push Horizon below Mirror (and Route) where possible and sensible to do so.
  • Prune Mirror operators from the final tree if the total Mirror count exceeds some small, hard-coded value, to mitigate Mirror operator explosions resulting from planning interactions between Mirror and other operators.
Horizon
└── Mirror
    ├── Route (EqualUnique on user Vindex[shard_index] Values[1] Seen:[m1.id = 1])
    │   └── ApplyJoin (on m1.id = m2.mirror_id columns: )
    │       ├── Table (user.mirror AS m1 WHERE m1.id = 1)
    │       └── Table (user.mirror_fragment AS m2)
    └── PercentMirrorTarget (Percent:50.000000)
        └── Route (Unsharded on main)
            └── ApplyJoin (on m1.id = m2.mirror_id columns: )
                ├── Table (main.mirror AS m1 WHERE m1.id = 1)
                └── Table (main.mirror_fragment AS m2)

Proof-of-concept

#14872

Schedule

If this RFC is approved, I think it will make sense to sequence blocks of development in this order:

  1. PR to implement basic mirroring:
    • Add new mirror rules protos.
    • Implement Mirror operator planning and engine primitive.
  2. Separate PR to introduce MoveTables MirrorTraffic.
  3. Stabilize through production usage.
    • Fix bugs and improve performance.
    • Improve monitoring/logging
  4. PR to implement DML mirroring with auto-rollback.

Approval

Looking forward to feedback, and ultimately hoping for approval, from the following groups/people.

  • @vitessio/query-serving
  • @vitessio/vreplication
  • @deepthi
@rohit-nayak-ps rohit-nayak-ps removed the Needs Triage This issue needs to be correctly labelled and triaged label Aug 13, 2023
@maxenglander maxenglander changed the title Feature Request: routing rules to mirror traffic during MoveTables RFC: routing rules to mirror traffic during MoveTables Dec 31, 2023
@maxenglander maxenglander self-assigned this Dec 31, 2023
@maxenglander maxenglander added Type: RFC Request For Comment and removed Type: Feature labels Dec 31, 2023
@GuptaManan100
Copy link
Member

@maxenglander This is a very well-written RFC! I looked at the query-serving related code in the linked PR too, and that looks good to me too!

@harshit-gangal
Copy link
Member

harshit-gangal commented Jan 9, 2024

@maxenglander The idea is great to have such a feature.
I would like to share some pointers.

  • The issue comes with data consistency as MoveTables works on the concept of eventual consistency.
  • I will rule out mirroring DML query execution as it can lead to potential locking not just for the query execution but also for the VReplication workflows.
    So, only planning can be performed for DML queries.
  • Select queries planning and execution would work but the results could potentially be out of sync and can generate a lot of false negatives.
  • Mirrored Select should be executed outside of the transaction as it can lead to the locking of rows/tables.
  • For the Implementation, I think the Mirror Primitive should be a high-level implementation to execute two similar plans with different keyspaces. This way low-level primitives are not blocked from mirror outputs.

@maxenglander
Copy link
Collaborator Author

Hey @harshit-gangal thanks for the feedback 🤗

  • Understood!
  • Agreed, will make sure we don't mirror DML.
  • I think it's OK for mirrored queries to target an out-of-sync target keyspace, and it will be up to the user to evaluate whether positives are false or true. It would be cool to solve this issue sometime in the future, but I think for now it's OK for mirrored queries to be "best effort".
  • Can you help me understand how mirrored Select would lead to locking? I think that a Select should not block DML unless it includes a ... FOR UPDATE. In that case we could exclude ... FOR UPDATE from mirroring altogether.
  • This part I think I need a bit more help understanding. Can you add a bit more detail to help me understand what you have in mind? E.g. how it is different from what I have in the demo PR.

@maxenglander
Copy link
Collaborator Author

In discussions with @rohit-nayak-ps, we agreed that if we go forward with this it should use a new topology record type, and not try to fit this into the routing rules record type. This will give us more flexibility in how we define the record type, and reduce the likelihood that we complicate or break the routing rules implementation. Updated RFC to reflect.

@maxenglander
Copy link
Collaborator Author

maxenglander commented Jan 30, 2024

Met with members of the Vitess team yesterday to gather feedback on the proposal and the demo implementation.

Some notes:

  • @mattlord pointed out that instrumenting differences in latency in a meaningful way is not going to be easy.
    • On an individual query basis, differences in latency are not meaningful, for example the CPU and network can introduce random latency differences.
    • Aggregating on a plan (SELECT, UPDATE) basis won't be meaningful either, since it won't tell you which specific queries are introducing latency differences.
    • Reporting on a query digest basis gets closer to something useful, but runs the risk of inflating memory usage.
    • Overall: it may make sense to leave latency stats out of an initial implementation, and tackle that as a follow-on PR or RFC.
  • @deepthi pointed out that at low values of percent, you may miss mirroring some important queries. @rohit-nayak-ps suggested this can be tackled by running every query pattern at least once, hashing it, and then applying percent dice rolls on subsequent queries that match that pattern.
  • The @vitessio/query-serving team said that the demo implementation should take a different approach to query planning.
    • The current approach wraps Route operators in Mirror operators.
    • We only care about overall differences in performance, not differences at the individual Route level.
    • So: rather than insert mirror primitives into the main plan, we should run a completely separate planbuilder.PlanQuery, which produces a query plan to mirrored keyspaces, and then execute that in parallel to the main plan.

Next steps: update current demo implementation to use alternative query planning/execution approach, and then loop back for another round of feedback.

@maxenglander
Copy link
Collaborator Author

I will no longer be able to work on this. If someone else wants to take this over feel free to re-open. Thanks everyone for the feedback and discussions on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
4 participants