[BUG] orioledb cluster creation fails with "Back-off restarting failed container" error #8760

Open
tianyue86 opened this issue Jan 8, 2025 · 1 comment
Labels
kind/bug Something isn't working

@tianyue86

Describe the env
Kubernetes: v1.31.1-aliyun.1
KubeBlocks: 1.0.0-beta.21
kbcli: 1.0.0-beta.8

To Reproduce
Steps to reproduce the behavior:

  1. Render the orioledb cluster YAML with Helm and apply it
helm template orioc2 ./addons-cluster/orioledb --version 1.0.0-alpha.0
---
# Source: orioledb-cluster/templates/cluster.yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: orioc2
  namespace: default
  labels:
    helm.sh/chart: orioledb-cluster-1.0.0-alpha.0
    app.kubernetes.io/version: "14.7.2"
    app.kubernetes.io/instance: orioc2
spec:
  terminationPolicy: Delete
  clusterDef: orioledb
  topology: replication
  componentSpecs:
    - name: orioledb
      serviceVersion: 14.7.2
      labels:
        apps.kubeblocks.postgres.patroni/scope: orioc2-orioledb
      disableExporter: true
      replicas: 2
      resources:
        limits:
          cpu: "0.5"
          memory: "0.5Gi"
        requests:
          cpu: "0.5"
          memory: "0.5Gi"
      volumeClaimTemplates:
        - name: data # ref clusterDefinition components.containers.volumeMounts.name
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
  2. The created cluster ends up in the Failed status
k get cluster -A
NAMESPACE   NAME            CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS     AGE
default     orioc2          orioledb             Delete               Failed     24m

k get pod
NAME                             READY   STATUS             RESTARTS           AGE
orioc2-orioledb-0                4/5     CrashLoopBackOff   9 (3m ago)         24m
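
The READY column above reads 4/5, meaning one of the pod's five containers is not ready. As a minimal sketch for spotting such pods in a longer listing, the helper below parses the plain-text table. The function name `unready_pods` is hypothetical, and it assumes the default column layout shown above (NAME first, READY second).

```python
def unready_pods(kubectl_output: str) -> list[tuple[str, int, int]]:
    """Return (pod name, ready, total) for pods whose READY column is n/m with n < m."""
    rows = []
    # Drop blank lines, then skip the header row.
    lines = [ln for ln in kubectl_output.splitlines() if ln.strip()]
    for line in lines[1:]:
        fields = line.split()
        name, ready = fields[0], fields[1]
        ready_count, total = (int(x) for x in ready.split("/"))
        if ready_count < total:
            rows.append((name, ready_count, total))
    return rows


# Sample copied from the output in this issue.
sample = """\
NAME                             READY   STATUS             RESTARTS           AGE
orioc2-orioledb-0                4/5     CrashLoopBackOff   9 (3m ago)         24m
"""
print(unready_pods(sample))  # [('orioc2-orioledb-0', 4, 5)]
```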
  3. Describe the pod
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               14m                   default-scheduler        Successfully assigned default/orioc2-orioledb-0 to cn-zhangjiakou.10.0.0.140
  Normal   SuccessfulAttachVolume  14m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "d-8vbbrthcj7220y1reqkk"
  Normal   AllocIPSucceed          14m                   terway-daemon            Alloc IP 10.0.0.10/24 took 38.526245ms
  Normal   Pulled                  14m                   kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/orioledb:beta1" already present on machine
  Normal   Created                 14m                   kubelet                  Created container pg-init-container
  Normal   Started                 14m                   kubelet                  Started container pg-init-container
  Normal   Pulled                  14m                   kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/dbctl:0.1.5" already present on machine
  Normal   Created                 14m                   kubelet                  Created container init-dbctl
  Normal   Started                 14m                   kubelet                  Started container init-dbctl
  Normal   Pulled                  14m                   kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.0-beta.21" already present on machine
  Normal   Created                 14m                   kubelet                  Created container init-kbagent
  Normal   Started                 14m                   kubelet                  Started container init-kbagent
  Normal   Pulled                  14m                   kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/orioledb:beta1" already present on machine
  Normal   Created                 14m                   kubelet                  Created container kbagent-worker
  Normal   Started                 14m                   kubelet                  Started container kbagent-worker
  Normal   Pulled                  14m                   kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/orioledb:beta1" already present on machine
  Normal   Created                 14m                   kubelet                  Created container postgresql
  Normal   Started                 14m                   kubelet                  Started container postgresql
  Normal   Pulled                  14m                   kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/pgbouncer:1.19.0" already present on machine
  Normal   Created                 14m                   kubelet                  Created container pgbouncer
  Normal   Started                 14m                   kubelet                  Started container pgbouncer
  Normal   Pulled                  14m                   kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/agamotto:0.1.0-beta.5" already present on machine
  Normal   Created                 14m                   kubelet                  Created container metrics
  Normal   Started                 14m                   kubelet                  Started container metrics
  Normal   Pulled                  14m                   kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/orioledb:beta1" already present on machine
  Normal   Created                 14m                   kubelet                  Created container kbagent
  Normal   Started                 14m                   kubelet                  Started container kbagent
  Normal   Pulled                  14m                   kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.0-beta.21" already present on machine
  Warning  BackOff                 4m26s (x53 over 14m)  kubelet                  Back-off restarting failed container postgresql in pod orioc2-orioledb-0_default(c0081a9f-c79a-4a9d-9080-24eb18b0ea38)
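
The only Warning event names the failing container (postgresql). As a rough sketch, the container, pod, and namespace can be pulled out of such a message with a regular expression. The pattern and the helper name `failing_containers` are assumptions based on the message format shown in this event; other Kubernetes versions may word the message differently.

```python
import re

# Pattern modeled on the BackOff message in the events above (an assumption,
# not a documented format): "... failed container <name> in pod <pod>_<namespace>(<uid>)".
BACKOFF_RE = re.compile(
    r"Back-off restarting failed container (?P<container>\S+) "
    r"in pod (?P<pod>[^_\s]+)_(?P<namespace>[^(\s]+)"
)


def failing_containers(events_text: str) -> list[tuple[str, str, str]]:
    """Return sorted, de-duplicated (namespace, pod, container) triples from BackOff events."""
    found = {
        (m["namespace"], m["pod"], m["container"])
        for m in BACKOFF_RE.finditer(events_text)
    }
    return sorted(found)


# Message copied from the event in this issue.
event = (
    "Warning  BackOff  4m26s (x53 over 14m)  kubelet  "
    "Back-off restarting failed container postgresql in pod "
    "orioc2-orioledb-0_default(c0081a9f-c79a-4a9d-9080-24eb18b0ea38)"
)
print(failing_containers(event))  # [('default', 'orioc2-orioledb-0', 'postgresql')]
```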
  4. Check the container logs
PostgreSQL Database directory appears to contain a database; Skipping initialization

2025-01-08 03:30:31,062 INFO: Failed to import patroni.dcs.consul
2025-01-08 03:30:31,138 INFO: Failed to import patroni.dcs.exhibitor
2025-01-08 03:30:31,140 INFO: Failed to import patroni.dcs.raft
2025-01-08 03:30:31,140 INFO: Failed to import patroni.dcs.zookeeper
Traceback (most recent call last):
  File "/usr/bin/patroni", line 8, in <module>
    sys.exit(main())
  File "/usr/lib/python3.10/site-packages/patroni/__main__.py", line 191, in main
    return patroni_main(args.configfile)
  File "/usr/lib/python3.10/site-packages/patroni/__main__.py", line 162, in patroni_main
    abstract_main(Patroni, configfile)
  File "/usr/lib/python3.10/site-packages/patroni/daemon.py", line 172, in abstract_main
    controller = cls(config)
  File "/usr/lib/python3.10/site-packages/patroni/__main__.py", line 32, in __init__
    self.dcs = get_dcs(self.config)
  File "/usr/lib/python3.10/site-packages/patroni/dcs/__init__.py", line 118, in get_dcs
    raise PatroniFatalException("""Can not find suitable configuration of distributed configuration store
patroni.exceptions.PatroniFatalException: Can not find suitable configuration of distributed configuration store
Available implementations: etcd, etcd3, kubernetes
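
The traceback ends in `get_dcs`, which found no usable distributed configuration store (DCS) settings. The sketch below is a simplified illustration of that kind of check, not Patroni's actual code: it assumes the configuration is a dictionary that must contain a section named after one of the implementations listed in the log. `pick_dcs` and `NoDcsConfigured` are hypothetical names, and the sample configuration values are made up for illustration.

```python
# Names taken from the "Available implementations" log line above.
AVAILABLE_DCS = ("etcd", "etcd3", "kubernetes")


class NoDcsConfigured(Exception):
    """Hypothetical stand-in for patroni.exceptions.PatroniFatalException."""


def pick_dcs(config: dict) -> str:
    """Return the first available DCS name that has a section in config."""
    for name in AVAILABLE_DCS:
        if name in config:
            return name
    raise NoDcsConfigured(
        "Can not find suitable configuration of distributed configuration store; "
        "available implementations: " + ", ".join(AVAILABLE_DCS)
    )


# Illustrative configurations (assumed values, not taken from the failing pod).
with_dcs = {"scope": "orioc2-orioledb", "kubernetes": {"namespace": "default"}}
without_dcs = {"scope": "orioc2-orioledb"}

print(pick_dcs(with_dcs))  # kubernetes
try:
    pick_dcs(without_dcs)
except NoDcsConfigured as exc:
    print("startup would fail:", exc)
```

If this reading is right, the postgresql container's effective Patroni configuration (file or environment) contains none of these sections, which would be the first thing to check in the pod spec.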

@shanshanying
Contributor

Known issue. If it cannot be fixed before release 1.0, we can remove this addon from the release 1.0 branch.
