[BUG]kb crash after stop cluster after update monitor #6922

Closed
ahjing99 opened this issue Mar 29, 2024 · 3 comments

ahjing99 commented Mar 29, 2024

➜ ~ kbcli version
Kubernetes: v1.27.8-gke.1067004
KubeBlocks: 0.9.0-alpha.7
kbcli: 0.9.0-alpha.2

  1. Stopping the cluster without updating the monitoring interval works
  `kbcli cluster create  weaviate-ctqacs --termination-policy=WipeOut --monitoring-interval=0 --cluster-definition=weaviate --enable-all-logs=false --cluster-version=weaviate-1.18.0 --set cpu=100m,memory=0.5Gi,replicas=1,storage=1Gi  --namespace default `

Cluster weaviate-ctqacs created

➜  ~ kbcli cluster hscale weaviate-ctqacs  --auto-approve --components weaviate --replicas 3
OpsRequest weaviate-ctqacs-horizontalscaling-cpbvc created successfully, you can view the progress:
	kbcli cluster describe-ops weaviate-ctqacs-horizontalscaling-cpbvc -n default

➜  ~ kbcli cluster stop weaviate-ctqacs
Please type the name again(separate with white space when more than one): weaviate-ctqacs
OpsRequest weaviate-ctqacs-stop-chvw9 created successfully, you can view the progress:
	kbcli cluster describe-ops weaviate-ctqacs-stop-chvw9 -n default

➜  ~ kbcli cluster start weaviate-ctqacs
OpsRequest weaviate-ctqacs-start-nlvw8 created successfully, you can view the progress:
	kbcli cluster describe-ops weaviate-ctqacs-start-nlvw8 -n default

  2. Update the monitoring interval, then stop the cluster: KubeBlocks crashes
➜  ~ kbcli cluster update weaviate-ctqacs --monitoring-interval=1
cluster.apps.kubeblocks.io/weaviate-ctqacs updated
➜  ~  kbcli cluster stop weaviate-ctqacs
Please type the name again(separate with white space when more than one): weaviate-ctqacs
OpsRequest weaviate-ctqacs-stop-cbrsr created successfully, you can view the progress:
	kbcli cluster describe-ops weaviate-ctqacs-stop-cbrsr -n default

➜  ~ k get ops weaviate-ctqacs-stop-cbrsr
NAME                         TYPE   CLUSTER           STATUS    PROGRESS   AGE
weaviate-ctqacs-stop-cbrsr   Stop   weaviate-ctqacs   Running   2/3        23m

2024-03-29T03:26:01.589Z	INFO	Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference  	{"controller": "replicatedstatemachine", "controllerGroup": "workloads.kubeblocks.io", "controllerKind": "ReplicatedStateMachine", "ReplicatedStateMachine": {"name":"weaviate-ctqacs-weaviate","namespace":"default"}, "namespace": "default", "name": "weaviate-ctqacs-weaviate", "reconcileID": "fc7d834a-6384-433f-aa64-2a0b830f35a7"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x23f0290]

goroutine 740 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:115 +0x1e5
panic({0x2708280?, 0x48c3180?})
	/usr/local/go/src/runtime/panic.go:914 +0x21f
github.com/apecloud/kubeblocks/pkg/controller/rsm2.(*replicasAlignmentReconciler).Reconcile(0x3169318?, 0xc00011fb40)
	/src/pkg/controller/rsm2/reconciler_replicas_alignment.go:153 +0x9b0
github.com/apecloud/kubeblocks/pkg/controller/kubebuilderx.(*controller).Do(0xc00199bb00, {0xc002d07da0?, 0x1, 0xc00017f320?})
	/src/pkg/controller/kubebuilderx/controller.go:85 +0x151
github.com/apecloud/kubeblocks/controllers/workloads.(*ReplicatedStateMachineReconciler).Reconcile(0xc000a248d0, {0x31632d8?, 0xc001da3c50}, {{{0xc000ed06a0, 0x7}, {0xc000d8be00, 0x18}}})
	/src/controllers/workloads/replicatedstatemachine_controller.go:90 +0x5b5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x31632d8?, {0x31632d8?, 0xc001da3c50?}, {{{0xc000ed06a0?, 0x2584980?}, {0xc000d8be00?, 0xc0004b4740?}}})
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:118 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00097ce60, {0x3163310, 0xc0006de8c0}, {0x2837e60?, 0xc0004b4740?})
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:314 +0x368
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00097ce60, {0x3163310, 0xc0006de8c0})
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:265 +0x1af
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:226 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 94
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222 +0x565

➜  ~ kbcli report cluster   --with-logs --all-containers   weaviate-ctqacs
reporting cluster information to report-cluster-weaviate-ctqacs-2024-03-29-11-34-12.zip
processing manifests                               OK
processing events                                  OK
process pod logs                                   OK
➜  ~ kbcli report kubeblocks --with-logs --all-containers --output yaml
reporting KubeBlocks information to report-kubeblocks-2024-03-29-11-34-24.zip
processing manifests                               OK
processing events                                  OK
process pod logs                                   OK
➜  ~
@ahjing99 ahjing99 added the kind/bug Something isn't working label Mar 29, 2024
@ahjing99 ahjing99 added this to the Release 0.9.0 milestone Mar 29, 2024
ahjing99 commented

openldap also has the same problem when the cluster is stopped right after creation (without enabling monitoring)

`kbcli cluster create  oldap-kvnsgj --termination-policy=DoNotTerminate --monitoring-interval=0 --cluster-definition=openldap --enable-all-logs=false --cluster-version=openldap-2.4.57 --set cpu=100m,memory=0.5Gi,replicas=2,storage=1Gi  --namespace default `

Cluster oldap-kvnsgj created

  `kbcli cluster stop oldap-kvnsgj --auto-approve  --namespace default `

OpsRequest oldap-kvnsgj-stop-xcc7m created successfully, you can view the progress:
	kbcli cluster describe-ops oldap-kvnsgj-stop-xcc7m -n default

2024-03-29T06:04:25.050Z	INFO	Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference  	{"controller": "replicatedstatemachine", "controllerGroup": "workloads.kubeblocks.io", "controllerKind": "ReplicatedStateMachine", "ReplicatedStateMachine": {"name":"oldap-kvnsgj-openldap-compdef","namespace":"default"}, "namespace": "default", "name": "oldap-kvnsgj-openldap-compdef", "reconcileID": "2d6b03a0-a6b0-40f5-ac04-5f480ed70b01"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x23f0290]

goroutine 827 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:115 +0x1e5
panic({0x2708280?, 0x48c3180?})
	/usr/local/go/src/runtime/panic.go:914 +0x21f
github.com/apecloud/kubeblocks/pkg/controller/rsm2.(*replicasAlignmentReconciler).Reconcile(0x3169318?, 0xc0032c1fc0)
	/src/pkg/controller/rsm2/reconciler_replicas_alignment.go:153 +0x9b0
github.com/apecloud/kubeblocks/pkg/controller/kubebuilderx.(*controller).Do(0xc00088d8c0, {0xc002d6a7d0?, 0x1, 0xc000138630?})
	/src/pkg/controller/kubebuilderx/controller.go:85 +0x151
github.com/apecloud/kubeblocks/controllers/workloads.(*ReplicatedStateMachineReconciler).Reconcile(0xc000ad2960, {0x31632d8?, 0xc0037128a0}, {{{0xc004271265, 0x7}, {0xc0036bf580, 0x1d}}})
	/src/controllers/workloads/replicatedstatemachine_controller.go:90 +0x5b5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x31632d8?, {0x31632d8?, 0xc0037128a0?}, {{{0xc004271265?, 0x2584980?}, {0xc0036bf580?, 0xc00349b980?}}})
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:118 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000ab4c80, {0x3163310, 0xc0006b68c0}, {0x2837e60?, 0xc00349b980?})
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:314 +0x368
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000ab4c80, {0x3163310, 0xc0006b68c0})
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:265 +0x1af
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:226 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 153
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222 +0x565
➜  ~

@ahjing99 ahjing99 added the severity/major Great chance user will encounter the same problem label Apr 1, 2024

free6om commented Apr 15, 2024

fixed by #6958

@free6om free6om closed this as completed Apr 15, 2024