This repository has been archived by the owner on Oct 5, 2020. It is now read-only.

PACKAGES: PAAS - TiDB CDC EPIC #387

Closed
winwisely99 opened this issue Apr 16, 2020 · 2 comments
Labels
good first issue Good for newcomers

Comments


winwisely99 commented Apr 16, 2020

A standard HA SQL database with Change Data Capture is needed.
When a change occurs the CDC pumps an event to NATS.
Minio supports exactly the same pattern, in that a change can publish an event to NATS.

The event can then be pushed to a client, and the client can then update.
This provides real time updates for both online and offline clients.

Architecture Patterns

NATS is used to push updates to client modules via gRPC. This is why Liftbridge is perhaps the best option: it has a native gRPC interface.

https://github.com/liftbridge-io/liftbridge
A user can have many devices that are at various states of "caught up" to the latest push events.

A user's new device will need to catch up from nothing, though. This means that all state must be stored in TiDB. The alternative is a pure Kafka architecture where every event that ever happened is pushed back to the device, but this has some downsides.

CDC pattern:
So a mutation is first stored in TiDB, and then the resulting event is pushed onto the correct NATS / Liftbridge topic so that all of the user's other devices also get the mutation pushed to them.
This is basically CDC!
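
The flow above can be sketched in Go. This is only an illustration under stated assumptions: `Table` is an in-memory stand-in for a TiDB table with a change feed, `Publisher` is a stand-in for a NATS / Liftbridge client, and the `user.<id>.changes` subject naming is hypothetical. None of these are real TiCDC or Liftbridge APIs.

```go
package main

import (
	"fmt"
	"sync"
)

// Change is a hypothetical change event: one row mutation for one user.
type Change struct {
	UserID string
	Key    string
	Value  string
}

// Publisher stands in for a NATS / Liftbridge client (assumption).
type Publisher interface {
	Publish(subject string, data []byte) error
}

// Table stands in for a TiDB table with a change feed attached (assumption).
type Table struct {
	mu   sync.Mutex
	rows map[string]string
	feed []Change // changes stored but not yet pumped
}

func NewTable() *Table { return &Table{rows: map[string]string{}} }

// Put stores the mutation first, then records it on the change feed.
func (t *Table) Put(userID, key, value string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.rows[userID+"/"+key] = value
	t.feed = append(t.feed, Change{UserID: userID, Key: key, Value: value})
}

// SubjectFor maps a user to the subject their devices listen on (hypothetical naming).
func SubjectFor(userID string) string { return "user." + userID + ".changes" }

// Pump drains the change feed and publishes each change to the owning user's subject.
// A real pump would retry or re-queue on failure instead of dropping the batch.
func (t *Table) Pump(p Publisher) error {
	t.mu.Lock()
	pending := t.feed
	t.feed = nil
	t.mu.Unlock()
	for _, c := range pending {
		if err := p.Publish(SubjectFor(c.UserID), []byte(c.Key+"="+c.Value)); err != nil {
			return err
		}
	}
	return nil
}

// memPublisher records published messages per subject so the flow can be inspected.
type memPublisher struct{ got map[string][]string }

func (m *memPublisher) Publish(subject string, data []byte) error {
	m.got[subject] = append(m.got[subject], string(data))
	return nil
}

func main() {
	tbl := NewTable()
	pub := &memPublisher{got: map[string][]string{}}
	tbl.Put("alice", "status", "online")
	if err := tbl.Pump(pub); err != nil {
		panic(err)
	}
	fmt.Println(pub.got[SubjectFor("alice")])
}
```

The point of the ordering is that the database write is the source of truth and the event is derived from it, so a device that misses the event can still recover the state from the database.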

Persistent log pattern:
Because NATS does not store events forever, you can also use the persistent log pattern, where all events are kept in the DB permanently to allow catch-up:
https://github.com/ThreeDotsLabs/watermill/tree/master/_examples/real-world-examples/persistent-event-log

This pattern should work effectively for all modules.

The persistent log pattern is the simplest one.
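
A minimal sketch of catch-up from a persistent log, in Go. `EventLog` is a hypothetical in-memory stand-in for an events table in the DB, and the sequence-number scheme is an assumption for illustration; it is not taken from the watermill example linked above.

```go
package main

import "fmt"

// Event is one entry in the persistent log. Seq starts at 1 and never repeats.
type Event struct {
	Seq     uint64
	Payload string
}

// EventLog is an append-only log; a real one would be a table in the DB (assumption).
type EventLog struct {
	events []Event
}

// Append stores the payload and returns the sequence number assigned to it.
func (l *EventLog) Append(payload string) uint64 {
	seq := uint64(len(l.events)) + 1
	l.events = append(l.events, Event{Seq: seq, Payload: payload})
	return seq
}

// Since returns a copy of every event with Seq greater than lastSeen.
// A brand-new device passes 0 and receives the whole history.
func (l *EventLog) Since(lastSeen uint64) []Event {
	if lastSeen >= uint64(len(l.events)) {
		return nil
	}
	out := make([]Event, len(l.events)-int(lastSeen))
	copy(out, l.events[lastSeen:])
	return out
}

func main() {
	log := &EventLog{}
	log.Append("created chat")
	log.Append("renamed chat")
	log.Append("added member")

	fmt.Println("new device catches up on", len(log.Since(0)), "events")
	fmt.Println("device at seq 2 catches up on", len(log.Since(2)), "events")
}
```

Each device only needs to remember the last sequence number it applied, which is what lets devices at different states of "caught up" share one log.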

Design constraints:

  • Try to keep all of this encapsulated behind a gRPC API, so the clients don't need to have ANY compile-time binding to TiDB, NATS or Liftbridge
    • Liftbridge wraps NATS with gRPC, so that's helpful
    • TiDB is SQL, with results. There are gRPC APIs that wrap databases but still allow users to make custom SQL calls to a DB. This is a stretch goal for v2 though, I think.
  • A Golang CLI and Flutter example to exercise the API
    • gRPC-Web with Envoy is required to use Flutter web with gRPC. You can see how we do it in our maintemplate and copy it.
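
One way to picture the encapsulation constraint is a Go interface that is the only thing client code compiles against. Everything here is hypothetical: `SyncAPI`, `Update` and `memBackend` are illustrative names, and in the real design the interface would be a generated gRPC client rather than a hand-written one.

```go
package main

import "fmt"

// Update is what a client receives when some state changes.
type Update struct {
	Key   string
	Value string
}

// SyncAPI is the only surface a client depends on (hypothetical).
// Nothing in it mentions TiDB, NATS or Liftbridge.
type SyncAPI interface {
	Mutate(userID, key, value string) error
	Subscribe(userID string) (<-chan Update, error)
}

// memBackend is an in-memory stand-in for the server side.
// It is not safe for concurrent use; it only exists to exercise the interface.
type memBackend struct {
	subs map[string][]chan Update
}

func newMemBackend() *memBackend {
	return &memBackend{subs: map[string][]chan Update{}}
}

func (b *memBackend) Subscribe(userID string) (<-chan Update, error) {
	ch := make(chan Update, 16)
	b.subs[userID] = append(b.subs[userID], ch)
	return ch, nil
}

func (b *memBackend) Mutate(userID, key, value string) error {
	for _, ch := range b.subs[userID] {
		select {
		case ch <- Update{Key: key, Value: value}:
		default: // subscriber buffer full: drop here; it would catch up from the DB
		}
	}
	return nil
}

// watchOne is client code: it takes the interface, never a concrete backend.
func watchOne(api SyncAPI, userID string) (Update, error) {
	ch, err := api.Subscribe(userID)
	if err != nil {
		return Update{}, err
	}
	if err := api.Mutate(userID, "theme", "dark"); err != nil {
		return Update{}, err
	}
	return <-ch, nil
}

func main() {
	u, err := watchOne(newMemBackend(), "alice")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s=%s\n", u.Key, u.Value)
}
```

Swapping the backend then only means providing another implementation of the interface, with no change to client code.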

Existing code

https://github.com/getcouragenow/packages/blob/master/mod-chat/server/Makefile

  • It's using NATS and Liftbridge
  • Needs TiDB CDC to make it go fast :)
  • Once TiDB CDC works we can plug into it.

Suggested Phasing:

  • Where to do it ?

  • make files

    • Use the boilerplate make files from boilerplate
  • Get tidb cdc and liftbridge running locally with just the golang binaries using a makefile

  • Implement a golang cli

    • It should setup a table in tidb
    • then populate with test data
    • then write the golang code to listen to the CDC events and pump them into LiftBridge and back out to the CLI

Next Phase is to get it working with k8

  • Bootstrap on cloud and local

    • cloud: k8, running using Helm, CI (github action)
    • local: same k8, but running in minikube. Keep it DRY please.
    • dockers to be all local please.
  • Dashboard

  • Bootstrap NATS Cluster

    • For HA, I would recommend running a NATS cluster rather than a single NATS Server instance. The NATS cluster is configured independently of Liftbridge.
    • The dockers from LiftBridge are excellent

CODE links:

https://github.com/pingcap/ticdc

https://github.com/liftbridge-io

SQL ORM that looks good

https://github.com/volatiletech/sqlboiler/blob/master/README.md#supported-databases

It relies on reflection on tables in the DB, which is often cleaner

https://github.com/volatiletech/mig
Schema migrations

https://github.com/direnv/direnv?files=1
Please use this so we don't rely on any bashrc etc etc files.
Note: volatiletech/sqlboiler#711

https://github.com/pingcap-incubator/tiup
Installs and sets up all tidb things for you.

  • includes tidb CDC support
@winwisely99

Shows how to setup multi data center k8 with tidb

https://elastisys.com/2018/12/10/geo-distributed-wordpress-with-kubernetes-and-tidb/
Terraform for variables
Simple design
In each DC only need a single server because each replicates globally. Can still have many SAAS autoscaling in each DC
Uses Google global geo load balancer. Easy.

Does NOT use SSL certs. Need that

@joe-getcouragenow

Once this is working we can then use it for ION.
see mod-ion.

An easy way to replace Redis for ion so it can scale globally.
Make a docker and a k8 for it, as some want to still just use docker.

Basically ion is using Redis with the pub/sub functionality. It's very easy to just stand up TiDB CDC and NATS, and you get an event from NATS whenever the data structure changes.

In the ion code base there are only 6 functions that use Redis and they are very simple logic so it should be easy to do.

NATS and TiDB run in cluster mode, so you can run in many data centers, and even if a minority of your data centers go down (for any reason) the system just keeps on going. When the data centers come back up, the database catches up automatically, and it also won't replay stale events that have already been pushed.
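
As a rough check on how many data-center failures the database can absorb: TiDB's storage layer (TiKV) replicates with Raft, which needs a majority of replicas to stay reachable. The sketch below does that arithmetic, assuming one replica per data center; the function names are illustrative only.

```go
package main

import "fmt"

// quorum returns how many replicas a majority-based (Raft-style) group needs.
func quorum(replicas int) int { return replicas/2 + 1 }

// tolerated returns how many replicas can fail while a majority survives.
func tolerated(replicas int) int {
	if replicas <= 0 {
		return 0
	}
	return replicas - quorum(replicas)
}

func main() {
	// Assumes one replica per data center.
	for _, n := range []int{3, 5, 7} {
		fmt.Printf("data centers=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), tolerated(n))
	}
}
```

Under this assumption, fewer than half of the data centers can be lost at once, so the replica count per region is worth deciding early.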

@joeky888 joeky888 mentioned this issue May 4, 2020
@gollariel gollariel self-assigned this May 5, 2020
@joe-getcouragenow joe-getcouragenow changed the title PACKAGES: PASS - tidb CDC PACKAGES: PASS - tidb CDC EPIC May 5, 2020
@joe-getcouragenow joe-getcouragenow changed the title PACKAGES: PASS - tidb CDC EPIC PACKAGES: PAAS - TiDB CDC EPIC May 5, 2020
@gollariel gollariel removed their assignment May 6, 2020
@joeky888 joeky888 removed their assignment May 7, 2020

4 participants