Problem
Nomad 1.1 adds Memory Oversubscription, which allows tasks to set a soft and a hard limit for their memory usage. When a system's memory is exhausted and a process needs to be OOM killed, the further a process is over its soft limit, the more likely it is to be killed.
However, this logic does not take into account the priorities of the application being run. Just because a task is using more memory does not make it "less valuable" to be running.
Proposal: resources.oom_score_adj
Add an oom_score_adj field to resources that only accepts positive values. A task can mark itself as more likely to be OOM killed, but not less likely.
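The "positive values only" rule could be checked along these lines. This is a minimal sketch, not Nomad's implementation: the function name `validate_oom_score_adj` is hypothetical, the upper bound of 1000 is the maximum the Linux kernel accepts for `oom_score_adj`, and treating 0 as a valid "no adjustment" value is an assumption.

```python
# Hypothetical validation for the proposed resources.oom_score_adj field.
# Assumption: 0 means "no adjustment" and is allowed; negatives are rejected
# so a task can never make itself less likely to be OOM killed.

OOM_SCORE_ADJ_MAX = 1000  # largest value Linux accepts for oom_score_adj


def validate_oom_score_adj(value: int) -> int:
    """Return value unchanged if it is an acceptable adjustment, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("oom_score_adj must be an integer")
    if value < 0:
        raise ValueError("oom_score_adj must not be negative")
    if value > OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj must be at most {OOM_SCORE_ADJ_MAX}")
    return value
```

Rejecting negative values at validation time keeps the guarantee simple: no job author can shield a task from the OOM killer at the expense of other workloads on the node.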
Abandoned Idea: Use Priority
It's tempting to make job.priority implicitly impact oom_score_adj, since preemption already uses it to stop lower-priority jobs.
However, job.priority applies to every task in a job. You may want a flaky log-shipping sidecar to get killed (and restarted) before anything else on a system, including the other tasks in its own job.
So while implicitly using priority might be valuable in the future, allowing users to explicitly demote some tasks is a necessary first step.
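To illustrate why a per-task adjustment matters, here is a rough model of how the Linux OOM killer ranks processes. It is a simplified sketch, assuming the score is the process's resident pages plus `oom_score_adj` scaled by total memory; the real kernel heuristic also counts swap and page tables. The function names, task names, and numbers are made up for the example.

```python
# Simplified model of Linux OOM victim selection (illustration only).
# Assumption: badness = resident pages + oom_score_adj * (total pages / 1000).


def badness(rss_pages: int, total_pages: int, oom_score_adj: int = 0) -> int:
    """Approximate OOM score: higher means more likely to be killed."""
    return rss_pages + oom_score_adj * (total_pages // 1000)


def pick_victim(tasks: dict, total_pages: int) -> str:
    """Return the name of the task with the highest badness score.

    tasks maps a task name to a (rss_pages, oom_score_adj) tuple.
    """
    return max(
        tasks,
        key=lambda name: badness(tasks[name][0], total_pages, tasks[name][1]),
    )


TOTAL_PAGES = 1_000_000

# Without an adjustment, the larger main task is the victim.
unadjusted = {"app": (500_000, 0), "log-shipper": (50_000, 0)}

# With a positive adjustment, the small sidecar is killed first.
adjusted = {"app": (500_000, 0), "log-shipper": (50_000, 500)}
```

In this model `pick_victim(unadjusted, TOTAL_PAGES)` selects `app`, while `pick_victim(adjusted, TOTAL_PAGES)` selects `log-shipper`, even though both tasks belong to the same job and would share the same job.priority.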
With updates to raw_exec, exec2 and docker, I consider this issue closed for now. Should we want to revisit putting this in the resources block, we have #23259 as a good starting point.