Move to cloud composer 2; better handle long-running tasks #36
…ensors for long-running tasks
66d1e5c to eee4ab6
7d39f07 to b3d2590
linkage_dag.py
push_to_gcs = BashOperator(
    task_id="push_to_gcs",
    bash_command=f'gcloud compute ssh jm3312@{gce_resource_id} --zone {gce_zone} --command "run_ids_script.sh &"',
Where is the run_ids_script.sh file? I couldn't find this one.
It's in utils!
Oh wait, I think you're right and this was a leftover task that wasn't doing anything. Fixed, thanks!!
Add ability to exclude pairs of ids for matching
Now splits the long-running simhash and merged id generation task into multiple tasks, using sensors to reduce issues with lost connections.
Also adds linting and moves the bucket to Cloud Composer 2.
Closes #35
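For reviewers unfamiliar with the pattern, here is a rough, standard-library-only sketch of the idea described above: one step starts the long-running job and returns immediately, and a separate step polls cheaply for a completion marker instead of holding a connection open. It is not the code in this PR. The names `start_job`, `job_finished`, and `wait_for`, the marker file, and the toy job are all hypothetical. In Airflow, a check like `job_finished` could be passed as the `python_callable` of a `PythonSensor`, which would take the place of the hand-written loop.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path


def start_job(marker: Path) -> subprocess.Popen:
    """Launch the long-running work and return at once (the 'kick off' task)."""
    # Hypothetical job: sleeps briefly, then writes a completion marker file.
    job = (
        "import sys, time, pathlib; time.sleep(0.2); "
        "pathlib.Path(sys.argv[1]).write_text('done')"
    )
    return subprocess.Popen([sys.executable, "-c", job, str(marker)])


def job_finished(marker: Path) -> bool:
    """Poke function: a cheap check that can be called repeatedly."""
    return marker.exists()


def wait_for(poke, poke_interval: float, timeout: float) -> bool:
    """Stand-in for a sensor: poll until poke() is true or the timeout passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if poke():
            return True
        time.sleep(poke_interval)
    return poke()


with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "job.done"
    proc = start_job(marker)  # task 1: start and return
    finished = wait_for(      # task 2: poll for completion
        lambda: job_finished(marker), poke_interval=0.05, timeout=10.0
    )
    proc.wait()
```

Because the polling step only needs a short check each time, a dropped connection costs one retry of the check rather than the whole job.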