[BUG]kb crashed on GKE #7080

Status: Closed
Labels: kind/bug (Something isn't working), severity/major (Great chance user will encounter the same problem)
Milestone: Release 0.9.0

ahjing99 (Collaborator) opened this issue on Apr 17, 2024 · 0 comments

➜ ~ kbcli version
Kubernetes: v1.27.8-gke.1067004
KubeBlocks: 0.9.0-beta.8
kbcli: 0.9.0-beta.1

➜  ~ kbcli kubeblocks install --version=0.9.0-beta.8 --set multiCluster.kubeConfig=multik8s  --set multiCluster.contexts='gke_kubeblocks_us-central1-c_k8s-2\,gke_kubeblocks_us-central1-c_k8s-3'
KubeBlocks will be installed to namespace "kb-system"
Kubernetes version 1.27.8
Kubernetes provider GKE
kbcli version 0.9.0-beta.1
Collecting data from cluster                       OK
Kubernetes cluster preflight                       OK
  Warn
  - This application requires at least 3 nodes
Create CRDs                                        OK
Add and update repo kubeblocks                     OK
Install KubeBlocks 0.9.0-beta.8                    OK
Wait for addons to be enabled
  apecloud-mysql                                   OK
  clickhouse                                       OK
  kafka                                            OK
  mongodb                                          OK
  postgresql                                       OK
  pulsar                                           OK
  redis                                            OK
  snapshot-controller                              OK

KubeBlocks 0.9.0-beta.8 installed to namespace kb-system SUCCESSFULLY!

-> Basic commands for cluster:
    kbcli cluster create -h     # help information about creating a database cluster
    kbcli cluster list          # list all database clusters
    kbcli cluster describe <cluster name>  # get cluster information

-> Uninstall KubeBlocks:
    kbcli kubeblocks uninstall

➜  ~ k get pod -n kb-system
NAME                                            READY   STATUS             RESTARTS      AGE
kb-addon-snapshot-controller-64b4dcb6bc-tbnl8   1/1     Running            0             7m17s
kubeblocks-64d5d99c78-plthq                     0/1     CrashLoopBackOff   2 (11s ago)   7m37s
kubeblocks-dataprotection-76d67b9b6d-6nk6b      1/1     Running            0             7m37s

2024-04-17T04:52:27.851Z	ERROR	Reconciler error	{"controller": "event", "controllerGroup": "", "controllerKind": "Event", "Event": {"name":"redis.17c6f7a7a40e295f","namespace":"default"}, "namespace": "default", "name": "redis.17c6f7a7a40e295f", "reconcileID": "d46499f1-7889-4736-ad2f-d3bbd0738f78", "error": "failed to get API group resources: unable to retrieve the complete list of server APIs: v1: Get \"https://35.184.255.41/api/v1\": getting credentials: exec: executable gke-gcloud-auth-plugin not found"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227
2024-04-17T04:52:27.851Z	INFO	All workers finished	{"controller": "event", "controllerGroup": "", "controllerKind": "Event"}
2024-04-17T04:52:27.851Z	INFO	Stopping and waiting for caches
2024-04-17T04:52:27.855Z	INFO	Stopping and waiting for webhooks
2024-04-17T04:52:27.855Z	INFO	Stopping and waiting for HTTP servers
2024-04-17T04:52:27.856Z	INFO	controller-runtime.metrics	Shutting down metrics server with timeout of 1 minute
2024-04-17T04:52:27.856Z	INFO	shutting down server	{"kind": "health probe", "addr": "[::]:8081"}
2024-04-17T04:52:27.856Z	INFO	Wait completed, proceeding to shutdown the manager
E0417 04:52:27.857092       1 leaderelection.go:369] Failed to update lock: Put "https://10.120.16.1:443/apis/coordination.k8s.io/v1/namespaces/kb-system/leases/001c317f.kubeblocks.io": context canceled
I0417 04:52:27.857154       1 leaderelection.go:285] failed to renew lease kb-system/001c317f.kubeblocks.io: timed out waiting for the condition
E0417 04:52:27.872540       1 leaderelection.go:308] Failed to release lock: Operation cannot be fulfilled on leases.coordination.k8s.io "001c317f.kubeblocks.io": the object has been modified; please apply your changes to the latest version and try again
2024-04-17T04:52:27.873Z	ERROR	error received after stop sequence was engaged	{"error": "leader election lost"}
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:490
2024-04-17T04:52:27.872Z	ERROR	setup	problem running manager	{"error": "failed to wait for configuration caches to sync: timed out waiting for cache to be synced for Kind *v1.ConfigMap"}
main.main
	/src/cmd/manager/main.go:602
runtime.main
	/usr/local/go/src/runtime/proc.go:267

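The log above points to the likely root cause: the kubeconfig supplied via `multiCluster.kubeConfig` was generated by gcloud, so its user entries authenticate through the `gke-gcloud-auth-plugin` exec credential plugin, and that binary does not appear to be present in the KubeBlocks manager container. A minimal client-go sketch (the mount path is hypothetical) of where the exec plugin comes into play:

```go
package main

import (
	"fmt"

	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Hypothetical path where the kubeconfig passed via
	// `--set multiCluster.kubeConfig=multik8s` might be mounted.
	cfg, err := clientcmd.BuildConfigFromFlags("", "/etc/kubeblocks/multik8s")
	if err != nil {
		panic(err)
	}

	// A GKE-generated kubeconfig authenticates through an exec credential
	// plugin, so cfg.ExecProvider is set and client-go shells out to
	// `gke-gcloud-auth-plugin` on every request. If that binary is not on
	// PATH inside the container, requests fail with the
	// "exec: executable gke-gcloud-auth-plugin not found" error seen above.
	fmt.Println("host:", cfg.Host, "uses exec plugin:", cfg.ExecProvider != nil)
}
```

One possible workaround is to hand the operator a kubeconfig whose member-cluster users carry static token or client-certificate credentials, which client-go can use without any exec plugin.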
➜ ~ k logs kubeblocks-64d5d99c78-plthq -n kb-system -p > kb.txt
Defaulted container "manager" out of: manager, tools (init), datascript (init)
(full previous-container log attached: kb.txt)

@ahjing99 added the kind/bug and severity/major labels on Apr 17, 2024
@ahjing99 added this to the Release 0.9.0 milestone on Apr 17, 2024
@ahjing99 changed the title from "[BUG]kb crashed" to "[BUG]kb crashed on GKE" on Apr 17, 2024
@leon-inf closed this as not planned on Apr 22, 2024