Nomad has the wrong unique.storage.bytesfree after a restart #14871
Comments
Hi @TimoWilken! Thanks to your clear instructions, I was able to reproduce this pretty easily. As it turns out, this is a duplicate of #6172. I'm going to rename that issue to clarify the problem a bit and then close this one out. In short, there's an unfortunate interaction here where the fingerprint for storage is a StaticFingerprinter. That means it runs once at startup, and then the scheduler has to account for the ephemeral disk it knows about. So if we check the storage on our node:
Then create a job with:
ephemeral_disk {
  size = 5000
}
We'll exec into that allocation to create a large file:
But we'll see the free storage isn't changed:
Then we restart. At that point we see that the reported free storage has dropped by the amount we wrote to disk (500793344 bytes, or 477.59MB). Note this isn't the same as the reserved amount, which would be roughly 10x that.
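As a quick check of the figures quoted above, here is a minimal sketch of the arithmetic. It assumes the 477.59MB figure uses binary megabytes (2^20 bytes) and that the `size = 5000` in the job's `ephemeral_disk` block is in the same unit; the variable names are illustrative only.

```python
# Arithmetic check of the numbers in the comment above.
# Assumption: "MB" means 2**20 bytes here, for both the written file
# and the ephemeral_disk reservation.
MIB = 2 ** 20

written_bytes = 500793344      # amount written inside the allocation
reserved_mib = 5000            # ephemeral_disk { size = 5000 }

written_mib = written_bytes / MIB
ratio = reserved_mib / written_mib

print(f"written: {written_mib:.2f} MiB")            # written: 477.59 MiB
print(f"reserved is about {ratio:.1f}x the written amount")
```

So the drop in free storage after the restart tracks what was actually written, not the 5000 MB reservation.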
Letting the user set this value is a nice hack and might help out with the situation where the storage fingerprinter can't read the correct value for some exotic storage configuration (I can't think of what that might be). I suspect the right behavior here is to change the storage fingerprinter. Currently it runs
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version
Output from nomad version: Nomad v1.3.3 (428b2cd8014c48ee9eae23f02712b7219da16d30)
Operating system and Environment details
CentOS 7; Nomad installed from Hashicorp Stable RPM repo and run via the bundled systemd service.
Issue
When Nomad is restarted, it resets its unique.storage.bytesfree property to however much disk space is available at that moment. However, if allocations are currently running, the disk space they use is counted twice -- once in the decrease in unique.storage.bytesfree and again by Nomad when deciding whether there is enough space to place another alloc on the same host.
Would it be possible to specify the value of the unique.storage.bytesfree metric manually (e.g. a client.disk_total_free_bytes setting in /etc/nomad.d/nomad.hcl), analogous to the way we can specify the amount of memory (client.memory_total_mb) or CPU (client.cpu_total_compute) that Nomad should assume is present on the host? Or could Nomad allocate based on unique.storage.bytestotal instead, and let us specify an amount of disk space to reserve for the system, as is done with CPU and memory in client.reserved.*?
Reproduction steps
ephemeral_disk (e.g. 50G) and just creates a 50G file in its task directory and sleeps forever.
Expected Result
The second job should run in parallel to the first (as notionally 60G should still be available to allocate -- 110G minus the 50G taken by the first job).
Actual Result
unique.storage.bytesfree goes down to 60G after the restart, and Nomad thinks it only has 10G of disk space left to allocate on the host (which is 60G minus 50G "taken" by the first, running job).
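The double counting can be illustrated with a small model using the numbers from this report. This is only a sketch of the accounting as described here, not Nomad's actual scheduler code: the function name `schedulable_bytes` is hypothetical, and "G" is assumed to mean GiB.

```python
# Simplified model of the accounting described in this issue.
# Assumptions: schedulable_bytes is a hypothetical helper, not a Nomad API;
# "G" is treated as GiB.
GIB = 1024 ** 3

def schedulable_bytes(fingerprinted_free, alloc_reservations):
    """Fingerprinted free space minus the ephemeral_disk reserved by running allocs."""
    return fingerprinted_free - sum(alloc_reservations)

first_job = 50 * GIB  # ephemeral_disk reservation, fully used by a 50G file

# Before the restart: the fingerprint was taken while the disk was still empty.
before = schedulable_bytes(110 * GIB, [first_job])

# After the restart: the fingerprint is re-read with the 50G file on disk,
# and the same 50G reservation is subtracted again.
after = schedulable_bytes((110 - 50) * GIB, [first_job])

print(before // GIB, after // GIB)  # 60 10
```

The first value matches the expected result (60G still available for a second job) and the second matches the actual result (10G) after the restart.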