Steps to reproduce
Create a project without cloud backends.
The same can be reproduced with cloud fleets, see below.
Get an on-prem fleet with one instance and another fleet with two instances.
> dstack fleet
 FLEET      INSTANCE  BACKEND       RESOURCES                  PRICE  STATUS  CREATED
 on-prem-2  0         ssh (remote)  2xCPU, 1GB, 35.2GB (disk)  $0.0   idle    1 hour ago
            1         ssh (remote)  2xCPU, 1GB, 35.2GB (disk)  $0.0   idle    1 hour ago
 on-prem-1  0         ssh (remote)  2xCPU, 1GB, 35.1GB (disk)  $0.0   idle    7 mins ago
Try running a task with two nodes or a service with two replicas.
type: service
replicas: 2
port: 12345
commands:
  - sleep infinity
resources:
  memory: 0.5GB..
  disk: 10GB..
Actual behaviour
dstack may assign the run to the fleet with one instance. The second job will then fail because the fleet does not have enough instances.
> dstack apply
 #  BACKEND  REGION  INSTANCE  RESOURCES                  SPOT  PRICE
 1  ssh      remote  instance  2xCPU, 1GB, 35.2GB (disk)  no    $0     idle
 2  ssh      remote  instance  2xCPU, 1GB, 35.2GB (disk)  no    $0     idle
 3  ssh      remote  instance  2xCPU, 1GB, 35.1GB (disk)  no    $0     idle

Submit a new run? [y/n]: y
 NAME               BACKEND       INSTANCE  RESOURCES                  RESERVATION  PRICE  STATUS      SUBMITTED  ERROR
 happy-pug-1                                                                               failed      22:24      JOB_FAILED
   replica=0 job=0  ssh (remote)  instance  2xCPU, 1GB, 35.1GB (disk)               $0.0   terminated  22:24      TERMINATED_BY_SERVER
   replica=1 job=0                                                                         failed      22:24      FAILED_TO_START_DUE_TO_NO_CAPACITY
Sometimes dstack chooses the correct fleet, so you may need to re-create one of the fleets a few times before the issue reproduces.
Expected behaviour
dstack chooses the fleet with two instances and both jobs are provisioned successfully.
If there are no fleets with enough capacity, dstack shows no offers and the run fails before submitting the jobs.
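The expected behaviour above can be sketched as a capacity check performed before any job is submitted. This is only an illustration under stated assumptions: the `Fleet` class, the `pick_fleet` function, and the "smallest fleet that fits" tie-break are hypothetical and are not taken from dstack's code base.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fleet:
    # Hypothetical model for illustration; not dstack's internal data structure.
    name: str
    idle_instances: int


def pick_fleet(fleets: List[Fleet], jobs_needed: int) -> Optional[Fleet]:
    """Return a fleet that can host every job of the run, or None."""
    candidates = [f for f in fleets if f.idle_instances >= jobs_needed]
    if not candidates:
        # No fleet has enough capacity: report no offers before submitting jobs.
        return None
    # Assumed tie-break: prefer the smallest fleet that still fits the run.
    return min(candidates, key=lambda f: f.idle_instances)


# Fleets from the reproduction: one instance and two instances.
fleets = [Fleet("on-prem-1", 1), Fleet("on-prem-2", 2)]
chosen = pick_fleet(fleets, jobs_needed=2)
print(chosen.name if chosen else "no offers")
```

With the two fleets from the reproduction and a two-replica service, the sketch selects `on-prem-2`; asking for three jobs returns `None`, which corresponds to the "no offers" case.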
dstack version
0.18.36
Additional information
The same can be reproduced with cloud fleets using --reuse.
> dstack fleet
 FLEET    INSTANCE  BACKEND           RESOURCES                           PRICE    STATUS  CREATED
 cloud-1  0         aws (eu-north-1)  2xCPU, 8GB, 100.0GB (disk), SPOT    $0.029   idle    2 mins ago
 cloud-2  0         aws (eu-north-1)  4xCPU, 16GB, 100.0GB (disk), SPOT   $0.0603  idle    1 min ago
          1         aws (eu-north-1)  4xCPU, 16GB, 100.0GB (disk), SPOT   $0.0603  idle    1 min ago

> dstack apply --reuse
 #  BACKEND  REGION      INSTANCE   RESOURCES                     SPOT  PRICE
 1  aws      eu-north-1  m5.large   2xCPU, 8GB, 100.0GB (disk)    yes   $0.029   idle
 2  aws      eu-north-1  m5.xlarge  4xCPU, 16GB, 100.0GB (disk)   yes   $0.0603  idle
 3  aws      eu-north-1  m5.xlarge  4xCPU, 16GB, 100.0GB (disk)   yes   $0.0603  idle

Submit a new run? [y/n]: y
 NAME               BACKEND           INSTANCE  RESOURCES                         RESERVATION  PRICE   STATUS      SUBMITTED  ERROR
 fuzzy-fish-1                                                                                          failed      22:58      JOB_FAILED
   replica=0 job=0  aws (eu-north-1)  m5.large  2xCPU, 8GB, 100.0GB (disk), SPOT               $0.029  terminated  22:58      TERMINATED_BY_SERVER
   replica=1 job=0                                                                                     failed      22:58      FAILED_TO_START_DUE_TO_NO_CAPACITY