
container with privileged context failed to be scheduled #611

Open
gongysh2004 opened this issue Nov 14, 2024 · 1 comment
Labels
kind/bug Something isn't working

Comments

@gongysh2004

gongysh2004 commented Nov 14, 2024

What happened:
A container with a privileged security context fails to be scheduled.
What you expected to happen:
The pod should be scheduled.
How to reproduce it (as minimally and precisely as possible):
Install HAMi according to the installation steps, then apply the following deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-test
  labels:
    app: gpu-test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gpu-test
  template:
    metadata:
      labels:
        app: gpu-test
    spec:
      containers:
      - name: gpu-test
        securityContext:
          privileged: true
        image: ubuntu:18.04
        resources:
          limits:
            nvidia.com/gpu: 2 # requesting 2 vGPUs
            nvidia.com/gpumem: 10240
        command: ["/bin/sh", "-c"]
        args: ["while true; do cat /mnt/data/test.txt; sleep 5; done"]
        volumeMounts:
        - mountPath: "/mnt/data"
          name: data-volume
      volumes:
      - name: data-volume
        hostPath:
          path: /opt/data
          type: Directory

Anything else we need to know?:

  • The output of nvidia-smi -a on your host
  • Your docker or containerd configuration file (e.g: /etc/docker/daemon.json)
  • The hami-device-plugin container logs
  • The hami-scheduler container logs
  • The kubelet logs on the node (e.g: sudo journalctl -r -u kubelet)
  • Any relevant kernel output lines from dmesg

Environment:

  • HAMi version:
root@node7vm-1:~/test# helm ls -A | grep hami
hami            kube-system     2               2024-11-14 15:18:36.886955318 +0800 CST deployed        hami-2.4.0                      2.4.0      
my-hami-webui   kube-system     4               2024-11-14 17:18:24.678439025 +0800 CST deployed        hami-webui-1.0.3                1.0.3   
  • nvidia driver or other AI device driver version:
root@node7bm-1:~# nvidia-smi 
Thu Nov 14 15:58:33 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08             Driver Version: 535.161.08   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L40S                    On  | 00000000:08:00.0 Off |                  Off |
| N/A   27C    P8              22W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA L40S                    On  | 00000000:09:00.0 Off |                  Off |
| N/A   28C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA L40S                    On  | 00000000:0E:00.0 Off |                  Off |
| N/A   26C    P8              19W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA L40S                    On  | 00000000:11:00.0 Off |                  Off |
| N/A   26C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA L40S                    On  | 00000000:87:00.0 Off |                  Off |
| N/A   26C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   5  NVIDIA L40S                    On  | 00000000:8D:00.0 Off |                  Off |
| N/A   26C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   6  NVIDIA L40S                    On  | 00000000:90:00.0 Off |                  Off |
| N/A   26C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   7  NVIDIA L40S                    On  | 00000000:91:00.0 Off |                  Off |
| N/A   27C    P8              19W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
  • Docker version from docker version
  • Docker command, image and tag used
  • Kernel version from uname -a
Linux node7vm-1 5.15.0-125-generic #135-Ubuntu SMP Fri Sep 27 13:53:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Others:

If I don't request GPU memory (drop the nvidia.com/gpumem limit):

Events:
  Type     Reason                    Age   From               Message
  ----     ------                    ----  ----               -------
  Normal   Scheduled                 19s   default-scheduler  Successfully assigned default/gpu-test-5f9f7d48d9-4wsrp to node7bm-1
  Warning  UnexpectedAdmissionError  20s   kubelet            Allocate failed due to rpc error: code = Unknown desc = no binding pod found on node node7bm-1, which is unexpected

If I do request GPU memory (keep the nvidia.com/gpumem limit):

Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  14s   default-scheduler  0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 Insufficient nvidia.com/gpumem. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod..
gongysh2004 added the kind/bug (Something isn't working) label Nov 14, 2024
@Nimbus318
Contributor

Privileged Pods have direct access to the host's devices: they share the host's device namespace and can directly access everything under the /dev directory. This essentially bypasses the container's device isolation.
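
For illustration, a minimal sketch of what that means in practice (the pod name and command here are hypothetical, not from the report): with privileged: true the container sees the host's /dev entries, so every /dev/nvidia* device on the host is visible regardless of what a device plugin would have allocated:

apiVersion: v1
kind: Pod
metadata:
  name: privileged-dev-check   # hypothetical name, for illustration only
spec:
  containers:
  - name: main
    image: ubuntu:18.04
    securityContext:
      privileged: true          # shares the host's device namespace
    # listing /dev/nvidia* here shows all of the host's GPU device nodes,
    # not just the ones a device plugin allocated to this container
    command: ["/bin/sh", "-c", "ls -l /dev/nvidia*; sleep infinity"]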

So, in our HAMi webhook:

	if ctr.SecurityContext.Privileged != nil && *ctr.SecurityContext.Privileged {
		klog.Warningf(template+" - Denying admission as container %s is privileged", req.Namespace, req.Name, req.UID, c.Name)
		continue
	}

the code just skips handling privileged Pods altogether, which means they fall back to being scheduled by the default scheduler. You can see from the Events you posted that the pod was scheduled by the default-scheduler.

So the reason scheduling fails when resources.limits includes nvidia.com/gpumem is that the default-scheduler doesn't recognize the nvidia.com/gpumem resource.
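
As a minimal sketch of the implication, assuming the intent is simply to drop the privileged flag so that the HAMi webhook processes the container and nvidia.com/gpumem is handled by the hami-scheduler rather than the default scheduler (pod name and command are illustrative, not prescribed by the maintainers):

apiVersion: v1
kind: Pod
metadata:
  name: gpu-test-unprivileged   # illustrative name
spec:
  containers:
  - name: gpu-test
    image: ubuntu:18.04
    # no privileged securityContext, so the webhook does not skip this container
    resources:
      limits:
        nvidia.com/gpu: 2        # requesting 2 vGPUs
        nvidia.com/gpumem: 10240
    command: ["/bin/sh", "-c", "sleep infinity"]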
