RFC: Add execution concurrency #5659

Open
wants to merge 13 commits into
base: master
from
6 changes: 5 additions & 1 deletion rfc/system/RFC-5659-execution-concurrency.md
@@ -54,6 +54,9 @@ enum ConcurrencyPolicy {

// fail the CreateExecution request and do not permit the execution to start
ABORT = 2;

// terminate the oldest execution when the concurrency limit is reached and immediately proceed with the new execution
REPLACE = 3;
}

message LaunchPlanSpec {
@@ -113,7 +116,7 @@ If we wanted further parallelization here, we could introduce a worker pool rath

We should consider adding an index to the executions table to include
- launch_plan_id
- phase
- phase==PENDING only (to safeguard well-populated flyteadmin instances that hold many completed, historical executions)
- created_at
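
A minimal sketch of the index idea above, restricted to PENDING rows. It uses Python's built-in `sqlite3` only so the example is self-contained; PostgreSQL accepts the same `CREATE INDEX ... WHERE` form for partial indexes. The table layout, the index name `idx_executions_pending`, and the helper `pending_for` are assumptions for illustration, not the actual flyteadmin schema.

```python
import sqlite3

# In-memory database with a simplified, hypothetical executions table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE executions (
        id INTEGER PRIMARY KEY,
        launch_plan_id INTEGER NOT NULL,
        phase TEXT NOT NULL,
        created_at TEXT NOT NULL
    )
    """
)

# Partial index: only rows with phase = 'PENDING' are indexed, so completed,
# historical executions do not add to the index size.
conn.execute(
    """
    CREATE INDEX idx_executions_pending
    ON executions (launch_plan_id, created_at)
    WHERE phase = 'PENDING'
    """
)

conn.executemany(
    "INSERT INTO executions (launch_plan_id, phase, created_at) VALUES (?, ?, ?)",
    [
        (1, "SUCCEEDED", "2024-01-01T00:00:00Z"),
        (1, "PENDING", "2024-01-02T00:00:00Z"),
        (2, "PENDING", "2024-01-03T00:00:00Z"),
        (1, "PENDING", "2024-01-04T00:00:00Z"),
    ],
)


def pending_for(connection, launch_plan_id):
    """Return ids of PENDING executions for one launch plan, oldest first."""
    rows = connection.execute(
        """
        SELECT id FROM executions
        WHERE phase = 'PENDING' AND launch_plan_id = ?
        ORDER BY created_at
        """,
        (launch_plan_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

The query repeats the literal `phase = 'PENDING'` predicate so that the planner is able to match it against the partial index.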

##### Concurrency across launch plan versions
@@ -195,6 +198,7 @@ WHERE ( launch_plan_named_entity_id, created_at ) IN (SELECT launch_plan_named_
GROUP BY launch_plan_named_entity_id);
```

Note, in this proposal, registering a new version of the launch plan and setting it to active will determine the concurrency policy across all launch plan versions.
Contributor

❤️


#### Prior Art
The flyteadmin native scheduler (https://github.com/flyteorg/flyte/tree/master/flyteadmin/scheduler) already implements a reconciliation loop to catch up on any missed schedules.
Contributor

Exactly, look at the prior art. The number of DB queries is minimal, actually one per cycle.

I would keep in memory all running executions that have concurrency_level set, and all LPs that have concurrency_level set (only their concurrency policies).
We should update these periodically, and it's OK for them to be eventually consistent.
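
To make the suggestion concrete, here is a rough sketch of such an in-memory view, assuming a periodic `refresh` that replaces the cached state from one database snapshot per cycle. The class names, the `admit` method, and the snapshot shape are hypothetical; only the `ABORT` and `REPLACE` values visible in this diff are modelled, and any other policy values are left out.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class ConcurrencyPolicy(Enum):
    # Only the values visible in the diff hunk are modelled here.
    ABORT = 2
    REPLACE = 3


@dataclass
class LaunchPlanState:
    policy: ConcurrencyPolicy
    concurrency_level: int
    running: deque = field(default_factory=deque)  # execution ids, oldest first

    def __post_init__(self):
        if self.concurrency_level < 1:
            raise ValueError("concurrency_level must be at least 1")


class ConcurrencyCache:
    """Eventually consistent, in-memory view of policies and running executions."""

    def __init__(self):
        self._plans = {}

    def refresh(self, snapshot):
        """Replace the cached state with one snapshot read (one query per cycle).

        snapshot maps launch plan name -> (policy, concurrency_level, running ids).
        """
        self._plans = {
            name: LaunchPlanState(policy, level, deque(running))
            for name, (policy, level, running) in snapshot.items()
        }

    def admit(self, launch_plan, execution_id):
        """Return (admitted, ids_to_terminate) for a new execution request."""
        state = self._plans.get(launch_plan)
        if state is None:
            return True, []  # no concurrency policy cached for this launch plan
        if len(state.running) < state.concurrency_level:
            state.running.append(execution_id)
            return True, []
        if state.policy is ConcurrencyPolicy.ABORT:
            return False, []  # limit reached: reject the new execution
        # REPLACE: drop the oldest running execution and admit the new one.
        oldest = state.running.popleft()
        state.running.append(execution_id)
        return True, [oldest]
```

Because the cache is only refreshed periodically, decisions made between refreshes may be based on slightly stale state, which matches the eventual consistency accepted above.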
