Need stream slicing in source-mysql to trigger table updates more often, instead of waiting unsafely for large tmp-file copies #47321
Unanswered
amelia-ay asked this question in Connector Questions
Connector Name
source-mysql
Connector Version
3.73
What step the error happened?
During the sync
Relevant information
When we synchronize data from MySQL to Databricks using airbyte-helm, the source table holds TB-scale data. We noticed that records are continuously written to tmp files, and are only merged into the destination table once the whole stream succeeds.
While loading such a large number of tmp files, the sync is highly likely to be interrupted, resulting in "Sync Partially Succeeded".
However, retrying does not solve the problem well: the data synchronization still takes several days, and it can also get stuck during the final "merge into ...".
If the source supported stream slicing, this problem might be solved; for example, the stream could be sliced by elapsed running time or by the records' size.
Also, until such a feature is implemented, is there any other way to increase the frequency of destination table updates to achieve a similar result?
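Airbyte's actual slicing interface is defined per connector, so as a purely illustrative sketch (plain Python, no Airbyte APIs), slicing a record stream by record count so the destination can be updated after each slice might look like this:

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def slice_stream(records: Iterable[T], slice_size: int) -> Iterator[List[T]]:
    """Group an unbounded record stream into bounded slices.

    Each slice can be written to tmp files and merged into the
    destination independently, so an interruption only forces a
    retry of the in-flight slice rather than the whole stream.
    """
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) >= slice_size:
            yield batch
            batch = []
    if batch:  # emit the final partial slice, if any
        yield batch

# Example: 10 records sliced into groups of at most 4.
slices = list(slice_stream(range(10), slice_size=4))
# → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

A time-based variant (cutting a slice after N running hours) would work the same way, just with a clock check instead of a length check.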
We tested with buildImage: mysql-dev.
Relevant log output
replication-orchestrator > Records read: 73790000 (84 GB)