As mentioned in #191 and #192, setting up the Spark cluster and configuring the migrator to correctly utilize the Spark resources is manual and tedious.
How much of this could be automated? Ideally, users would only supply the table size and the throughput supported by the source and target tables, and they should get a Spark cluster correctly sized to transfer the data as efficiently as the source and target databases support.
Some ideas to explore:
- Provide such inputs as parameters of the Ansible playbook and automatically configure the corresponding Spark resources.
- Publish a tool (e.g. using Pulumi or Terraform) that automatically provisions a cloud-based cluster (e.g. using AWS EC2), ready to run the migrator.
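As a rough illustration of what such automation could compute, here is a minimal sizing sketch in Scala. The input names (`tableSizeGiB`, `throughputMiBPerSec`, `perTaskMiBPerSec`) and the formulas are assumptions made for illustration only, not the migrator's actual configuration model:

```scala
// Hypothetical sizing sketch: derive Spark resources from user-supplied figures.
// Names and formulas are illustrative, not the migrator's real configuration.
object ClusterSizing {
  final case class Inputs(
    tableSizeGiB: Double,        // total data to transfer
    throughputMiBPerSec: Double, // sustained throughput the source/target can support
    perTaskMiBPerSec: Double     // throughput one Spark task can realistically push
  )

  final case class Sizing(concurrentTasks: Int, suggestedParallelism: Int, estimatedHours: Double)

  def size(in: Inputs): Sizing = {
    // Number of concurrent tasks needed to saturate the allowed throughput.
    val tasks = math.max(1, math.ceil(in.throughputMiBPerSec / in.perTaskMiBPerSec).toInt)
    // Rough wall-clock estimate at the given throughput.
    val hours = in.tableSizeGiB * 1024 / in.throughputMiBPerSec / 3600
    Sizing(concurrentTasks = tasks, suggestedParallelism = tasks * 2, estimatedHours = hours)
  }
}
```

For example, a 2 TiB table with a 200 MiB/s throughput ceiling and roughly 10 MiB/s per task would come out to about 20 concurrent tasks and roughly 3 hours of transfer time; the playbook could then translate those figures into executor counts and `spark.default.parallelism`.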
Hello,
Usually the terms the user wants to deal with are "here are my items and their average item size", plus a migration target. I usually discount the migration target. Basically, if we can be clear about converting items and time into the throughput we want, that works too.
... by migration target, I mean duration: "I want the migration to complete in 10 hours", which we then convert into the throughput the source and target would need to sustain.
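A minimal sketch of that conversion, assuming the user supplies an item count, an average item size, and a target duration; the names, units, and example figures below are illustrative only:

```scala
// Hypothetical conversion from "items, average item size, target duration"
// into the sustained throughput the source and target would need to support.
object ThroughputFromTarget {
  def requiredThroughputMiBPerSec(items: Long, avgItemBytes: Double, targetHours: Double): Double = {
    val totalMiB = items * avgItemBytes / (1024.0 * 1024.0)
    totalMiB / (targetHours * 3600.0)
  }

  def main(args: Array[String]): Unit = {
    // Example: 2 billion items of ~1 KiB each, migrated in 10 hours.
    val mibPerSec = requiredThroughputMiBPerSec(items = 2000000000L, avgItemBytes = 1024, targetHours = 10)
    println(f"Required sustained throughput: $mibPerSec%.0f MiB/s")
  }
}
```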