
[BUG] Create a pvc backuprepo, but got permission denied #6927

Closed
wutz opened this issue Mar 31, 2024 · 7 comments · Fixed by #6928
Comments

wutz commented Mar 31, 2024

Describe the bug

I created a PVC-backed BackupRepo, but its pre-check job failed with "permission denied".

To Reproduce
Steps to reproduce the behavior:

  1. Create a StorageClass named shared-nvme, provisioned by ceph-csi-cephfs
  2. kbcli backuprepo create --provider pvc --storage-class-name "shared-nvme" --access-mode "ReadWriteMany" --volume-capacity "1Ti" --default

Expected behavior

The backuprepo is created successfully.
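
For verification, the repo status can be checked after creation; a sketch, assuming kbcli 0.8.2 provides the list subcommand (the CRD can also be queried directly):

# check that the BackupRepo reaches the Ready phase
$ kbcli backuprepo list
$ kubectl get backuprepos.dataprotection.kubeblocks.io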

Desktop (please complete the following information):

  • OS: Ubuntu 22.04
  • Kubernetes: v1.28.7+k3s1
  • KubeBlocks: 0.8.2
  • kbcli: 0.8.2

Additional context

Job failure message: BackoffLimitExceeded:Job has reached the specified backoff limit

Logs from the pre-check job:
  sh: can't create /backup/precheck.txt: Permission denied

Events from Pod/kb-system/pre-check-bf71d55e-backuprepo-phwsw-4fks6:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  7s    default-scheduler  Successfully assigned kb-system/pre-check-bf71d55e-backuprepo-phwsw-4fks6 to mn03.zw1.local
  Normal  Pulled     3s    kubelet            Container image "infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:0.8.2" already present on machine
  Normal  Created    3s    kubelet            Created container pre-check
  Normal  Started    3s    kubelet            Started container pre-check

Events from PersistentVolumeClaim/kb-system/pre-check-bf71d55e-backuprepo-phwsw:
  Type    Reason                 Age   From                                                                                                          Message
  ----    ------                 ----  ----                                                                                                          -------
  Normal  ExternalProvisioning   43s   persistentvolume-controller                                                                                   Waiting for a volume to be created either by the external provisioner 'cephfs.csi.ceph.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
  Normal  Provisioning           43s   cephfs.csi.ceph.com_cephfs-ceph-csi-cephfs-provisioner-7c995964d8-9kw27_ea3ca64e-b582-4ea6-a413-c4c2ab6fdd82  External provisioner is provisioning volume for claim "kb-system/pre-check-bf71d55e-backuprepo-phwsw"
  Normal  ProvisioningSucceeded  43s   cephfs.csi.ceph.com_cephfs-ceph-csi-cephfs-provisioner-7c995964d8-9kw27_ea3ca64e-b582-4ea6-a413-c4c2ab6fdd82  Successfully provisioned volume pvc-2ad531be-34bf-4366-9492-453d917afc17

Events from Job/kb-system/pre-check-bf71d55e-backuprepo-phwsw:
  Type     Reason                Age   From            Message
  ----     ------                ----  ----            -------
  Normal   SuccessfulCreate      43s   job-controller  Created pod: pre-check-bf71d55e-backuprepo-phwsw-w5m9f
  Normal   SuccessfulCreate      28s   job-controller  Created pod: pre-check-bf71d55e-backuprepo-phwsw-db5vc
  Normal   SuccessfulCreate      7s    job-controller  Created pod: pre-check-bf71d55e-backuprepo-phwsw-4fks6
  Warning  BackoffLimitExceeded  0s    job-controller  Job has reached the specified backoff limit
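
The logs and events above can be collected with standard kubectl commands; for reference (resource names differ per run):

$ kubectl -n kb-system logs pre-check-bf71d55e-backuprepo-phwsw-4fks6
$ kubectl -n kb-system describe job pre-check-bf71d55e-backuprepo-phwsw
$ kubectl -n kb-system describe pvc pre-check-bf71d55e-backuprepo-phwsw
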
wutz added the kind/bug (Something isn't working) label Mar 31, 2024

zjx20 (Contributor) commented Mar 31, 2024

It appears that our pre-check job lacks permission to write to the volume. Could you please follow the steps below to reproduce the issue?

  1. Create a PVC using the shared-nvme StorageClass:
kubectl apply -f - <<-EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-cephfs
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
  storageClassName: shared-nvme
  volumeMode: Filesystem
EOF
  2. Mount this PVC in a pod:
kubectl apply -f - <<"EOF"
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: mounter
    image: ubuntu:24.04
    command: ["/bin/bash", "-c"]
    args:
    - |
      echo hello > /data/hello.txt
      cat /data/hello.txt
    volumeMounts:
    - mountPath: /data
      name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: test-cephfs
EOF
  3. Observe the log output for any error messages similar to "permission denied":
kubectl logs test-pod -f

If the above steps reproduce the issue, I suspect there is a problem with your ceph-csi-cephfs configuration.

zjx20 assigned zjx20 and unassigned nayutah Mar 31, 2024

wutz (Author) commented Mar 31, 2024

$ kubectl logs test-pod -f
hello

@zjx20 It's working properly.

wutz (Author) commented Mar 31, 2024

I replaced the test pod image with infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:0.8.2 and changed bash to sh:

$ kubectl logs -f test-pod2
/bin/sh: can't create /data/hello.txt: Permission denied
hello
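
For reference, a minimal sketch of the modified manifest (only the pod name, image, and shell differ from the test-pod example above; the exact manifest may vary slightly):

kubectl apply -f - <<"EOF"
apiVersion: v1
kind: Pod
metadata:
  name: test-pod2
spec:
  containers:
  - name: mounter
    image: infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:0.8.2
    # the kubeblocks-tools image ships sh rather than bash
    command: ["/bin/sh", "-c"]
    args:
    - |
      echo hello > /data/hello.txt
      cat /data/hello.txt
    volumeMounts:
    - mountPath: /data
      name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      # same shared cephfs PVC as the earlier test-pod example
      claimName: test-cephfs
EOF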

wutz (Author) commented Mar 31, 2024

 $ id
uid=65532 gid=0(root) groups=0(root)

The default uid is 65532 (non-root) in the image infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:0.8.2
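
A quick way to confirm the image's default user locally (a sketch, assuming Docker is available; the image does contain an id binary, as shown above):

# USER recorded in the image metadata (empty if unset)
$ docker inspect --format '{{.Config.User}}' infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:0.8.2
# effective uid/gid inside the container
$ docker run --rm --entrypoint id infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:0.8.2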

wutz (Author) commented Mar 31, 2024

Perhaps the uid needs to be set to 0 when running the pre-check pod:

securityContext:
  runAsUser: 0
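
For placement, the field would sit in the pre-check job's pod template, roughly like this (a generic Kubernetes sketch to show where the field goes, not the actual KubeBlocks template; the command approximates the pre-check based on the log above):

apiVersion: batch/v1
kind: Job
metadata:
  name: pre-check
spec:
  template:
    spec:
      # run the pre-check container as root so it can write to the mounted volume
      securityContext:
        runAsUser: 0
      containers:
      - name: pre-check
        image: infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:0.8.2
        command: ["sh", "-c", "echo ok > /backup/precheck.txt"]
        volumeMounts:
        - mountPath: /backup
          name: backup
      volumes:
      - name: backup
        persistentVolumeClaim:
          claimName: pre-check-bf71d55e-backuprepo-phwsw
      restartPolicy: Never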

zjx20 (Contributor) commented Mar 31, 2024

Perhaps the uid needs to be set to 0 when running the pre-check pod:

securityContext:
  runAsUser: 0

Thank you for providing this; I will make a fix. However, there is no quick workaround for the current version (v0.8.2), so you will have to wait for the next release. Sorry for the inconvenience.

wutz (Author) commented Mar 31, 2024

As a workaround:

  1. Build a new kubeblocks-tools image with the default user set to root:
$ cat << 'EOF' > Dockerfile
FROM infracreate-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:0.8.2

USER root
EOF
$ docker build -t ghcr.io/wutz/kubeblocks-tools:0.8.2 .
$ docker push ghcr.io/wutz/kubeblocks-tools:0.8.2
  2. Edit the kubeblocks-dataprotection deployment and update the KUBEBLOCKS_TOOLS_IMAGE value to ghcr.io/wutz/kubeblocks-tools:0.8.2 (or use the nju.edu.cn ghcr proxy: ghcr.nju.edu.cn/wutz/kubeblocks-tools:0.8.2); one way to do this is sketched below.
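
Assuming KUBEBLOCKS_TOOLS_IMAGE is exposed as a container environment variable on that deployment (as step 2 implies), a sketch of how to apply the change:

# update the env var in place (kb-system is the namespace seen in the events above)
$ kubectl -n kb-system set env deployment/kubeblocks-dataprotection \
    KUBEBLOCKS_TOOLS_IMAGE=ghcr.io/wutz/kubeblocks-tools:0.8.2
# or edit the deployment manually
$ kubectl -n kb-system edit deployment kubeblocks-dataprotection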

github-actions bot added this to the Release 0.9.0 milestone Apr 1, 2024