Describe the bug
When a k8s-manager does not have a GPU, Omnia will not deploy the k8s-device-plugin. We need to inspect the entire inventory for GPUs before deploying the plugin. I suggest we also taint or label any compute nodes that do not have GPUs, because NVIDIA's plugin does not check. The AMD plugin seems to deploy just fine whether there are AMD accelerators or not.
Identify Nodes without GPUs:
You need a mechanism to determine which compute nodes in your Kubernetes cluster do not have GPUs available. This can be done through manual inspection or automated scripts that query node specifications.
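One way to automate this check is sketched below. It is only an illustration, not how Omnia does it: it assumes the node list has been exported with kubectl get nodes -o json and that GPUs show up under the nvidia.com/gpu resource name. That resource is only reported once a device plugin has registered on the node, so on a fresh cluster every node would look GPU-less and the hosts would have to be inspected directly (for example with lspci) instead. The function name and the sample node names are hypothetical.

```python
import json
from typing import List

# Assumption: GPUs are advertised under this extended-resource name.
GPU_RESOURCE = "nvidia.com/gpu"


def nodes_without_gpus(node_list_json: str) -> List[str]:
    """Return the names of nodes whose capacity reports no GPU_RESOURCE."""
    items = json.loads(node_list_json).get("items", [])
    missing = []
    for node in items:
        capacity = node.get("status", {}).get("capacity", {})
        # Quantities are strings in the API; a missing key counts as zero.
        if int(capacity.get(GPU_RESOURCE, "0")) == 0:
            missing.append(node["metadata"]["name"])
    return missing


# Hypothetical two-node sample in the shape of `kubectl get nodes -o json`.
sample = json.dumps({"items": [
    {"metadata": {"name": "compute-1"},
     "status": {"capacity": {"cpu": "32", "nvidia.com/gpu": "4"}}},
    {"metadata": {"name": "compute-2"},
     "status": {"capacity": {"cpu": "32"}}},
]})

print(nodes_without_gpus(sample))
```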
Node Labeling:
Once you identify nodes without GPUs, apply labels to them using kubectl label nodes <node-name> <label-key>=<label-value>.
For example, you can label nodes without GPUs as gpu-enabled=false.
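If the labeling is scripted, the command can be built per node, roughly as in this sketch. The helper name label_command and the node names are hypothetical; the gpu-enabled=false label is the example from above, and --overwrite is added so a rerun does not fail when the label already exists.

```python
import shlex
from typing import List


def label_command(node: str, key: str = "gpu-enabled",
                  value: str = "false") -> List[str]:
    """Build the kubectl argument list that sets key=value on one node."""
    # --overwrite lets the command succeed if the label is already set.
    return ["kubectl", "label", "nodes", node, f"{key}={value}", "--overwrite"]


# Hypothetical node names; print the commands instead of running them.
for node in ["compute-2", "compute-3"]:
    print(shlex.join(label_command(node)))
```

Each list can be passed to subprocess.run(cmd, check=True) once the printed commands look right.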
Node Tainting:
Apply taints to nodes without GPUs to repel workloads that require GPUs. A taint keeps any Pod that does not tolerate it from being scheduled on these nodes.
Use kubectl taint nodes <node-name> <key>=<value>:<effect> to apply taints.
For instance, you can use a taint like gpu-accelerator=false:NoSchedule.
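The taint command can be generated the same way. This is a sketch under the same assumptions: taint_command and the node name are hypothetical, and the defaults mirror the gpu-accelerator=false:NoSchedule example. The effect is validated against the three effects Kubernetes defines, so that a typo fails before kubectl is ever called.

```python
import shlex
from typing import List

# The three taint effects Kubernetes defines.
VALID_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


def taint_command(node: str, key: str = "gpu-accelerator",
                  value: str = "false",
                  effect: str = "NoSchedule") -> List[str]:
    """Build the kubectl argument list that taints one node."""
    if effect not in VALID_EFFECTS:
        raise ValueError(f"unknown taint effect: {effect}")
    return ["kubectl", "taint", "nodes", node,
            f"{key}={value}:{effect}", "--overwrite"]


# Hypothetical node name; print the command instead of running it.
print(shlex.join(taint_command("compute-2")))
```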
Configure Workloads:
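Ensure that GPU-dependent workloads do not tolerate these taints, or give them node selectors that consider GPU availability. Workloads that do not need a GPU must tolerate the taints, otherwise they are repelled from the tainted nodes as well.
For example, in the Pod specification of a workload that does not need a GPU, you might add tolerations for the taints applied to nodes without GPUs.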
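As a rough illustration, the sketch below builds a Pod manifest carrying a toleration that matches the example taint gpu-accelerator=false:NoSchedule. A toleration only permits scheduling onto a tainted node; it does not attract the Pod there. The helper names, the Pod name cpu-job, and the busybox image are all hypothetical. The manifest is emitted as JSON, which kubectl apply -f - accepts alongside YAML.

```python
import json
from typing import Dict, List


def toleration_for(key: str = "gpu-accelerator", value: str = "false",
                   effect: str = "NoSchedule") -> Dict[str, str]:
    """Toleration that matches a taint of the form key=value:effect."""
    return {"key": key, "operator": "Equal", "value": value, "effect": effect}


def pod_manifest(name: str, image: str,
                 tolerations: List[Dict[str, str]]) -> dict:
    """Minimal single-container Pod manifest with the given tolerations."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{"name": name, "image": image}],
            "tolerations": tolerations,
        },
    }


# Hypothetical Pod name and image, for illustration only.
manifest = pod_manifest("cpu-job", "busybox", [toleration_for()])
print(json.dumps(manifest, indent=2))
```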