-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New version of zdm-proxy helm chart using StatefulSets #89
base: main
Are you sure you want to change the base?
Conversation
…rearchy of some helm values
…, so helm is no longer managing it
…s.yaml, including logLevel, readMode, primaryCluster, etal
…o-started, add livenessProbe and readinessProbe to proxy pods
I was going through the PR as we are planing to utilize the helm chart to deploy the zdm-proxy to our clusters. I have few questions about this PR, it seems to me that you have added a new stateful set and set the services to point to the pods in the stateful set. However the deployment is still there, so if I understand this correctly, alongside the new statefulset the zdm proxy is also going to be deployed as a deployment, but no service will point to the deployment(s) anymore. I'm also not 100% clear why the CDM deployment has been turned into a simple pod. We are using preemptive instances and they are recycled frequently. Based on kubernetes docs if the node is going away the pod is gone. |
@lorantfecske-bud The latest commit in this branch |
@lorantfecske-bud We're planning to take CDM (Cassandra Data Migrator) out of the zdm-proxy helm chart, as it is a completely separate open source project https://github.com/datastax/cassandra-data-migrator and needs to have its own deployment and lifecycle management approach. For now, you can use the following command to run CDM in your k8s cluster without the helm chart:
|
@@ -21,6 +21,6 @@ spec: | |||
name: cql | |||
selector: | |||
{{- $zdm_selectorLabels | nindent 4 }} | |||
app: {{ $zdm_fullname }}-{{ $index }} | |||
statefulset.kubernetes.io/pod-name: {{ $zdm_fullname }}-{{ $index }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you trying to match a per-pod label?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsanda The reason for us to create individual services and match them 1:1 at pod level is because unlike C*, zdm-proxy processes don't have gossip protocol to detect/notify topology changes among themselves; whatever set of IP addresses and their respective ordinal index (see here) specified at the start of the proxy process is going to stuck for the rest of its lifecycle. To allow dynamical IP address change to the pods due to reschedule/cordon/crash, we decided to for N number of proxies, implement N number of services, and them map 1-on-1. Given that pods managed by k8s Deployment object doesn't have a static pod name, we switched to using StatefulSets. In StatefulSets, each pod has a unique/static pod-name that we can map the service to. This will allow orderly rolling restart of the zdm-proxy pods to happen.
@weideng1 thanks for then answers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Popping some comments in, but no glaring blockers from me. I expect @jsanda will handle proper approval.
selector: | ||
matchLabels: | ||
{{- $zdm_selectorLabels | nindent 6 }} | ||
app: {{ $zdm_fullname }}-{{ $index }} | ||
app: {{ $zdm_fullname }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider these recommended labels from the k8s docs.
https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/
- sh | ||
- "-c" | ||
- "ZDM_PROXY_TOPOLOGY_INDEX=`echo ${HOSTNAME##*-}` /main" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's surprising that you're specifying a command here vs just passing in arguments to the command specified in the Docker image (or relying on arguments there).
- sh | ||
- "-c" | ||
- "ZDM_PROXY_TOPOLOGY_INDEX=`echo ${HOSTNAME##*-}` /main" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to populate an environment variable with a fieldRef
from metadata.name
. See https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it looks like you're extracting the ordinal index which is not available to the pod directly. It's frustrating that this isn't available. (I see the link to kubernetes/kubernetes#40651 as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bradfordcp Yes exactly. That was the route I explored and ended up using the workaround as you pointed out. Should we resolve this comment?
@jimdickinson @bradfordcp when you're back from the holidays, could you please help to take a look at the helm chart in this PR? Compared to the original version, the main improvement in this PR is that I've now started to use STS to manage all proxy pods, while keeping the services to route traffic to individual proxy pods via their names. This enables regular rolling restart using
kubectl -n zdmproxy rollout restart sts/zdm-proxy
with the appropriate wait time in between. Changes to values.yaml or--set
values will trigger all proxy pods to roll-restart, so we can easily scale out and scale in.I also removed configmap.yaml as all of the values in there can be directly retrieved from helm chart's values.yaml. This way, changes to these values can be directly detected by StatefulSets and automatically trigger rolling restart of all proxy pods without causing any downtime.