Denormalization consistency #2387
chubei started this conversation in Feature Requests
Replies: 0 comments
Problem
Imagine the following simple scenario:
Replicate two tables from the same Oracle database to Aerospike with denormalization.
Say the two tables are CUSTOMERS and TRANSACTIONS, and TRANSACTIONS has a foreign key that points to a row in the CUSTOMERS table. This is a typical one-to-many relationship.
In this case, for the denormalization to work, a customer must be created in the sink table before any transaction that references it is processed by the sink. The current pipeline does not guarantee this. The problem is twofold, affecting both snapshotting and replication.
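To make the ordering requirement concrete, here is a minimal sketch of a denormalizing sink. It is not Dozer or Aerospike code: `DenormalizingSink`, `run`, and the row fields (`id`, `customer_id`, `name`, `amount`) are hypothetical names chosen for illustration. The sink embeds the customer row into each transaction record, so it can only succeed if the customer arrives first.

```python
from dataclasses import dataclass, field


@dataclass
class DenormalizingSink:
    """Toy sink (hypothetical) that embeds the customer row into each transaction."""
    customers: dict = field(default_factory=dict)
    transactions: list = field(default_factory=list)

    def apply(self, table: str, row: dict) -> None:
        if table == "CUSTOMERS":
            self.customers[row["id"]] = row
        elif table == "TRANSACTIONS":
            # Denormalization needs the parent row to already be in the sink.
            customer = self.customers.get(row["customer_id"])
            if customer is None:
                raise LookupError(f"customer {row['customer_id']} not yet in sink")
            self.transactions.append({**row, "customer": customer})


def run(operations) -> DenormalizingSink:
    """Feed (table, row) pairs to a fresh sink in the given order."""
    sink = DenormalizingSink()
    for table, row in operations:
        sink.apply(table, row)
    return sink


customer = ("CUSTOMERS", {"id": 1, "name": "Alice"})
transaction = ("TRANSACTIONS", {"id": 10, "customer_id": 1, "amount": 25})

# Parent first: the transaction is stored with its customer embedded.
ordered = run([customer, transaction])

# Child first: the sink cannot denormalize and raises.
try:
    run([transaction, customer])
    out_of_order_failed = False
except LookupError:
    out_of_order_failed = True
```

A real sink might buffer or retry instead of raising, but either way it needs an ordering guarantee from the pipeline to produce a complete record.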
Solution
Snapshotting
The connector should respect table order, and the pipeline should ensure that `Source`s are passed to connectors in declaration order. We then ask the user to declare each depended-upon source before any source that depends on it.
Replication
Currently, operations from a single source are identified by the table they belong to and are sent along different edges of the DAG, effectively becoming unordered relative to each other. This "distribution" is implemented via the "port" concept.
Here we propose removing "port", so that any node (processor or sink) that is interested in any table of a source receives all operations from that source. When initializing a processor or sink, the user must specify which table(s) it should process.
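The proposal above could be sketched roughly as follows. This is an illustration under assumptions, not the actual Dozer API: `Operation`, `Node`, `on_operation`, and `broadcast` are hypothetical names. Each node is initialized with the tables it cares about, receives every operation of the source in source order, and skips the tables it did not ask for.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    table: str
    kind: str  # "insert", "update" or "delete"
    row: dict


@dataclass
class Node:
    """Hypothetical processor or sink, initialized with the tables it processes."""
    name: str
    tables: frozenset
    seen: list = field(default_factory=list)

    def on_operation(self, op: Operation) -> None:
        # Every node receives every operation; it ignores tables it did not request.
        if op.table in self.tables:
            self.seen.append((op.table, op.kind, op.row["id"]))


def broadcast(source_ops, nodes) -> None:
    """Deliver all operations of one source to all nodes, preserving source order."""
    for op in source_ops:
        for node in nodes:
            node.on_operation(op)


source_ops = [
    Operation("CUSTOMERS", "insert", {"id": 1}),
    Operation("TRANSACTIONS", "insert", {"id": 10, "customer_id": 1}),
    Operation("CUSTOMERS", "insert", {"id": 2}),
    Operation("TRANSACTIONS", "insert", {"id": 11, "customer_id": 2}),
]

denormalizer = Node("denormalizer", frozenset({"CUSTOMERS", "TRANSACTIONS"}))
customers_only = Node("customers_only", frozenset({"CUSTOMERS"}))
broadcast(source_ops, [denormalizer, customers_only])
```

Because there is a single stream per source rather than one edge per table, the denormalizer sees each customer before the transaction that references it, while a node interested in only one table simply filters the rest out.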