What happened:
After installation, when I run kubectl describe node nodegpu1, it does not show the gpumem-related resource.

What you expected to happen:
'nvidia.com/gpumem:' should be shown under the 'Allocatable' section of the node.

How to reproduce it (as minimally and precisely as possible):
Follow the installation steps, then run kubectl describe node nodegpu1.

My current info is:
The design of device plugins is to simplify resource management, ensuring that each instance focuses on managing a single type of device resource. The related interfaces, like Registration and ListAndWatch, enforce this by limiting each device plugin instance to reporting and managing only one resource type.
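To illustrate that constraint, here is a rough sketch against the k8s.io/kubelet deviceplugin v1beta1 API. It is not our actual plugin source; the socket name and device list are made up. The point is that a Register call carries a single ResourceName and ListAndWatch streams devices for that one resource only:

```go
package main

import (
	"context"
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

// gpuPlugin serves exactly one extended resource. The DevicePlugin gRPC
// service has no field for advertising a second resource such as
// nvidia.com/gpumem.
type gpuPlugin struct {
	devices []*pluginapi.Device // e.g. {ID: "GPU-0", Health: pluginapi.Healthy}
}

// ListAndWatch streams the device list for the single registered resource.
// (The other DevicePluginServer methods are omitted here for brevity.)
func (p *gpuPlugin) ListAndWatch(_ *pluginapi.Empty, s pluginapi.DevicePlugin_ListAndWatchServer) error {
	if err := s.Send(&pluginapi.ListAndWatchResponse{Devices: p.devices}); err != nil {
		return err
	}
	<-s.Context().Done() // a real plugin would resend on health changes
	return nil
}

// register tells the kubelet which single ResourceName this endpoint serves.
func register(ctx context.Context) error {
	conn, err := grpc.DialContext(ctx, "unix://"+pluginapi.KubeletSocket,
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		return err
	}
	defer conn.Close()

	_, err = pluginapi.NewRegistrationClient(conn).Register(ctx, &pluginapi.RegisterRequest{
		Version:      pluginapi.Version,
		Endpoint:     "nvidia-gpu.sock", // illustrative socket name
		ResourceName: "nvidia.com/gpu",  // only ONE resource per registration
	})
	return err
}

func main() {
	if err := register(context.Background()); err != nil {
		log.Fatal(err)
	}
}
```

Because RegisterRequest has a single ResourceName field, a second resource like nvidia.com/gpumem simply has no place to go through this interface.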
In our case, to handle GPU resources like cores and memory for scheduling, we register the total allocatable amounts as annotations on the nodes, and you can see such an annotation on the node object.
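The exact annotation key and value format depend on the version you installed, so instead of pasting one here, this is a hedged sketch of how to dump the GPU-related annotations from your node. The 'hami.io/' and 'nvidia' filters and the node name are only assumptions; kubectl get node nodegpu1 -o jsonpath='{.metadata.annotations}' shows the same information without any code:

```go
package main

import (
	"context"
	"fmt"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	node, err := cs.CoreV1().Nodes().Get(context.TODO(), "nodegpu1", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// GPU core/memory totals live in annotations, not in status.allocatable,
	// which is why `kubectl describe node` does not list nvidia.com/gpumem.
	for k, v := range node.Annotations {
		if strings.Contains(k, "nvidia") || strings.HasPrefix(k, "hami.io/") {
			fmt.Printf("%s = %s\n", k, v)
		}
	}
}
```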
And because we can only report nvidia.com/gpu to the kubelet, the node's 'Allocatable' list never includes nvidia.com/gpumem, which is what produced the '2 Insufficient nvidia.com/gpumem' warning from the default scheduler in the previous issue #611.
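For context, a pod still asks for gpumem as an ordinary extended-resource limit; it is the vGPU scheduler, reading the node annotations above, that satisfies it, since the value never appears under the node's 'Allocatable'. A sketch of such a request follows (the pod name, image, and quantities are purely illustrative):

```go
package main

import (
	"fmt"
	"log"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	// nvidia.com/gpu is backed by the device plugin and appears under the
	// node's Allocatable; nvidia.com/gpumem is resolved from node annotations
	// by the vGPU scheduler instead, so it never appears there.
	limits := corev1.ResourceList{
		"nvidia.com/gpu":    resource.MustParse("1"),
		"nvidia.com/gpumem": resource.MustParse("3000"), // illustrative amount
	}

	pod := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "gpumem-demo"}, // hypothetical name
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:      "cuda",
				Image:     "nvidia/cuda:11.6.2-base-ubuntu20.04", // illustrative image
				Resources: corev1.ResourceRequirements{Limits: limits},
			}},
		},
	}

	// Print the pod spec as YAML to show where the two limits end up.
	out, err := yaml.Marshal(pod)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```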
Anything else we need to know?:
- nvidia-smi -a output on your host
- docker configuration file (e.g. /etc/docker/daemon.json)
- kubelet logs (e.g. sudo journalctl -r -u kubelet)
- dmesg output

Environment:
- docker version
- uname -a
- node's annotation: