one-at-a-time batch scheduling #10015
Comments
Hi @pznamensky! This is definitely a tricky one. If it weren't for stopping the old version of the job, would it be OK in your case to run the old job and the new job concurrently? If so, you might be able to work around this by using a parameterized job and then dispatching it. Using this example job:

job "example" {
datacenters = ["dc1"]
type = "batch"
parameterized {
payload = "optional"
}
group "worker" {
task "worker" {
driver = "docker"
config {
image = "busybox:1"
command = "/bin/sh"
args = ["-c", "echo 'this looks like work'; sleep 300"]
}
resources {
cpu = 255
memory = 128
}
dispatch_payload {
file = "something.txt"
}
}
}
}
I run the job and then dispatch it multiple times. This results in multiple dispatched children of the job running concurrently. If I then edit the job and re-run it, that only changes the "parent" job, and I can dispatch yet a third concurrent job.
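The command sequence looks roughly like this (a sketch; the file name example.nomad is an assumption, and the CLI output is omitted):

  # register the parameterized job; by itself it does not run anything
  nomad job run example.nomad

  # each dispatch creates a new child job with its own allocation
  nomad job dispatch example
  nomad job dispatch example

  # show the parent job and its dispatched children
  nomad job status example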
Hi @tgross!
Ok. Unfortunately there's not a better way to do that as far as I can tell.
@tgross is there a way to prevent running multiple allocations? (I am managing the periodic scheduling outside of Nomad and would like to dispatch, but I want to ensure that only one job of this type is running at any given time.)
No, there is not currently, short of taking a lock.
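As a rough sketch of what "taking a lock" could look like (this is not a built-in Nomad feature): wrap the task's own command in consul lock, so that a second dispatched instance blocks until the first one releases the lock. This assumes the image ships the consul binary and /bin/sh, that CONSUL_HTTP_ADDR points at a reachable Consul agent, and the lock prefix locks/example-worker is made up:

  task "worker" {
    driver = "docker"

    config {
      # assumption: any image that contains the consul binary and /bin/sh
      image   = "hashicorp/consul:1.9"
      command = "consul"
      # "consul lock" holds a lock on the given KV prefix while the child command
      # runs; a concurrently dispatched instance blocks here until it is released
      args = [
        "lock", "locks/example-worker",
        "/bin/sh", "-c", "echo 'this looks like work'; sleep 300"
      ]
    }

    env {
      # assumption: a Consul agent address reachable from inside the task
      CONSUL_HTTP_ADDR = "consul.example.internal:8500"
    }
  }

Note that the waiting instance still holds its resource allocation while it is blocked on the lock, so this only serializes the work; it does not stop Nomad from placing a second allocation.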
Nomad version
Nomad v0.12.9 (45c139e)
Issue
We've got a long-running batch job (it runs for around several days) which we definitely don't want to stop during a deploy. But it looks like there is no straightforward way to avoid stopping the currently running batch job after submitting a new version.

So we're trying to work around it with shutdown_delay (since it is supported in batch jobs: #7271). And it works ... but not as accurately as we'd like. The main problem is that shutdown_delay doesn't honor task state and blocks execution of the new version even though the old one has already finished. For instance:
- a job with shutdown_delay = 3d is started
- the shutdown_delay isn't over yet, so a new job version has to wait yet another day even though the old job has already finished

It would be perfect if shutdown_delay took into account that the batch job has already finished and started the new job version immediately after the current one completes.

Reproduction steps
Run the job, then change JOB_VERSION to something else and run it again.
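A minimal sketch of a job that exercises this behaviour; the job name, image, and the short shutdown_delay value are assumptions (the issue itself uses a multi-day delay):

  job "long-batch" {
    datacenters = ["dc1"]
    type        = "batch"

    group "worker" {
      task "worker" {
        driver = "docker"

        config {
          image   = "busybox:1"
          command = "/bin/sh"
          args    = ["-c", "echo 'long-running work'; sleep 3600"]
        }

        env {
          # changing this value produces a new job version on the next `nomad job run`
          JOB_VERSION = "1"
        }

        # per this issue: a newly submitted version has to wait out this delay
        # even if the old allocation has already finished on its own
        shutdown_delay = "15m"
      }
    }
  }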
Some closing thoughts
Probably it would be better to implement this via the update {} stanza instead of the shutdown_delay workaround; if you think so, please ask me to file a new issue.