"Getting started" example leads to underreplicated cluster #235

Closed
michael-ylb opened this issue Aug 22, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@michael-ylb

Describe the bug
When the CPU requests are set above the CPU limits through the operator, the operator terminates the replicas and the cluster ends up in an unstable state.

To Reproduce
Steps to reproduce the behavior:
Follow these steps (I left out "Pass custom Dragonfly arguments"): https://www.dragonflydb.io/docs/getting-started/kubernetes-operator
The last step will be:
kubectl patch dragonfly dragonfly-sample --type merge -p '{"spec":{"resources":{"requests":{"cpu":"2"}}}}'

The CPU requests for the "dragonfly" resource are now above the CPU limits (600m). The operator applies these settings to the StatefulSet. The StatefulSet terminates the replicas, but is not able to create new pods:
"create Pod dragonfly-sample-1 in StatefulSet dragonfly-sample failed error: Pod "dragonfly-sample-1" is invalid: spec.containers[0].resources.requests: Invalid value: "2": must be less than or equal to cpu limit"

The operator stays in this state and does not update the StatefulSet, even after the CPU requests are set back to a valid value.
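
For context, the rejection quoted above comes down to how CPU quantities compare: a bare `2` means two full cores (2000m), which is more than the 600m limit. Here is a minimal sketch of that comparison. It handles only plain decimal values and the `m` suffix, and the helper names are hypothetical (they are not part of Kubernetes or the operator):

```python
from decimal import Decimal


def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "600m" or "2" to millicores.

    Assumption: only plain decimals ("2", "0.5") and the milli suffix
    ("600m") are handled; other quantity forms are out of scope here.
    """
    q = quantity.strip()
    if q.endswith("m"):
        return int(Decimal(q[:-1]))
    return int(Decimal(q) * 1000)


def request_within_limit(request: str, limit: str) -> bool:
    """Return True when the CPU request does not exceed the CPU limit."""
    return cpu_millicores(request) <= cpu_millicores(limit)


print(request_within_limit("2", "600m"))     # False: 2000m > 600m
print(request_within_limit("500m", "600m"))  # True
```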

Expected behavior
The operator should never apply an operation that leads to this state. If it does end up there, it should be able to recover.
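
One way to picture such a safeguard is a check on the merged spec before anything is applied to the StatefulSet. The sketch below is not the operator's code. It assumes the resource starts with a CPU request of `500m` (an assumption; only the 600m limit is stated in this report), and all function names are hypothetical. It also shows why the patch from the reproduction steps is a problem: a merge patch replaces only the keys it names, so the old limit stays in place next to the new request.

```python
import copy
from decimal import Decimal


def merge_patch(target, patch):
    """Apply JSON merge patch (RFC 7386) semantics to nested dicts."""
    if not isinstance(patch, dict):
        return patch
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # a null value removes the key
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


def millicores(quantity):
    """Convert "600m" or "2" to millicores (other forms not handled)."""
    q = quantity.strip()
    return int(Decimal(q[:-1])) if q.endswith("m") else int(Decimal(q) * 1000)


def cpu_error(obj):
    """Return an error message if the CPU request exceeds the CPU limit."""
    resources = obj.get("spec", {}).get("resources", {})
    request = resources.get("requests", {}).get("cpu")
    limit = resources.get("limits", {}).get("cpu")
    if request is None or limit is None:
        return None  # nothing to compare
    if millicores(request) > millicores(limit):
        return f"cpu request {request} exceeds cpu limit {limit}"
    return None


# Assumed starting point: the 500m request is illustrative, the 600m limit
# is the value mentioned in this report.
current = {"spec": {"resources": {"requests": {"cpu": "500m"},
                                  "limits": {"cpu": "600m"}}}}
# The patch from the reproduction steps.
patch = {"spec": {"resources": {"requests": {"cpu": "2"}}}}

merged = merge_patch(current, patch)
print(cpu_error(current))  # None
print(cpu_error(merged))   # cpu request 2 exceeds cpu limit 600m
```

A guard like this could reject the change up front, or keep the previous valid spec until a valid one arrives.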

Environment (please complete the following information):

  • OS: Ubuntu 22.04
  • Kernel: 6.8.0-40-generic #40~22.04.3-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 30 17:30:19 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
  • Containerized?: Kubernetes 1.21.14 in minikube
  • Dragonfly Version: 1.21.2
  • Operator Version: 1.1.7
@michael-ylb michael-ylb added the bug Something isn't working label Aug 22, 2024
@dranikpg dranikpg transferred this issue from dragonflydb/dragonfly Aug 22, 2024
@Abhra303
Contributor

Abhra303 commented Aug 22, 2024

Hi @michael-ylb, the pod termination is expected behaviour, since the CPU request exceeds the CPU limit. However, the operator should ideally recreate the pods once the custom resource is updated with a valid configuration. There is already a ticket for this, #165, and the fix will be included in the next release. I am closing this in favour of that ticket.

@Abhra303
Contributor

I will update the docs in the meantime. Thanks!
