Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix StatefulSet Update Logic to Avoid Forbidden Field Updates #2214

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

fengyinqiao
Copy link
Contributor

@fengyinqiao fengyinqiao commented Jul 16, 2024

Description

This PR addresses an issue with the previous implementation of updating a StatefulSet, where the entire expectedStatefulSet was being directly updated to Kubernetes. This approach often resulted in update failures since Kubernetes only allows specific fields to be updated in a StatefulSet (replicas, ordinals, template, updateStrategy, persistentVolumeClaimRetentionPolicy, and minReadySeconds).

The new implementation introduces a function, UpdateStatefulSetSpec, which ensures that only the permissible fields are updated. Additionally, it preserves specific legacy labels, such as the zone label, to maintain backward compatibility.

Changes

  • Introduced a new function UpdateStatefulSetSpec that updates only the allowed fields of a StatefulSet.
  • Ensures legacy labels are carried over during the update process.
  • Updated existing code to use the new function for StatefulSet updates.

Function Documentation

// UpdateStatefulSetSpec updates the spec of an existing StatefulSet with the spec of the expected StatefulSet.
// It only updates the fields that are allowed to be updated in Kubernetes StatefulSets, which include replicas,
// ordinals, template, updateStrategy, persistentVolumeClaimRetentionPolicy, and minReadySeconds. Additionally,
// it ensures that specific legacy labels, such as the zone label, are carried over to the new spec.
//
// Parameters:
// - existingStatefulSet: The current StatefulSet object.
// - expectedStatefulSet: The desired StatefulSet object with the updated spec.
//
// Returns:
// - A new StatefulSet object with the updated spec and preserved legacy labels.
func UpdateStatefulSetSpec(existingStatefulSet *appsv1.StatefulSet, expectedStatefulSet *appsv1.StatefulSet) *appsv1.StatefulSet {
    // for legacy reasons, if the zone label is present in SS we must carry it over
    carryOverLabels := make(map[string]string)
    if val, ok := existingStatefulSet.Spec.Template.ObjectMeta.Labels[miniov1.ZoneLabel]; ok {
        carryOverLabels[miniov1.ZoneLabel] = val
    }

    newStatefulSet := existingStatefulSet.DeepCopy()

    newStatefulSet.Spec.Template = expectedStatefulSet.Spec.Template
    newStatefulSet.Spec.UpdateStrategy = expectedStatefulSet.Spec.UpdateStrategy

    if existingStatefulSet.Spec.Template.ObjectMeta.Labels == nil {
        newStatefulSet.Spec.Template.ObjectMeta.Labels = make(map[string]string)
    }
    for k, v := range carryOverLabels {
        newStatefulSet.Spec.Template.ObjectMeta.Labels[k] = v
    }
    return newStatefulSet
}

Issue Reference

Testing

  • Manually tested the new implementation to ensure that only the allowed fields are updated.
  • Verified that legacy labels are preserved during the update.

Additional Notes

  • This change should improve the reliability of StatefulSet updates and prevent update failures due to forbidden field modifications.

@jiuker @cniackz Please review the changes and let me know if any adjustments are needed. Thank you!

Copy link
Contributor

@ramondeklein ramondeklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the "expected" statefulset is created based on changes to other resources. Can't we prevent changes to these resources if they would incur changes to the statefulset? Now changes are not made and users may have no clue why their change isn't propagating to the statefulset...

PS: Code looks fine.

@fengyinqiao
Copy link
Contributor Author

I guess the "expected" statefulset is created based on changes to other resources. Can't we prevent changes to these resources if they would incur changes to the statefulset? Now changes are not made and users may have no clue why their change isn't propagating to the statefulset...

PS: Code looks fine.

@ramondeklein So do you mean that you want to disable the modification of some fields in the operator console (such as tenant.spec.podManagementPolicy field), or set validation rules in the CRD to disable the modification, or set validationWebhook to disable the modification?

@ramondeklein
Copy link
Contributor

@fengyinqiao Yes, if we can check if a change in a CRD field would lead to an invalid statefulset, then it would be better to disallow the user from making that change. That would result in an error and the user can deal with that. If that change is allowed to the CRDs, but it won't have any effect to the statefulset, the user will wonder why it won't work.

PS: Don't worry about the operator console UI. It has been removed in #2205.

@fengyinqiao
Copy link
Contributor Author

@ramondeklein I checked it, and the following fields will be ignored in the update of the STS after this PR is merged:

  1. tenant.spec.pools[*].name
    Modifications to this field will trigger modifications to sts.spec.selector, and modifications to the selector field will be ignored after this PR is merged because sts prohibits modifications to this field.
    Selector: ContainerMatchLabels(t, pool),
  2. tenant.spec.podManagementPolicy
    Modifications to this field will trigger modifications to sts.spec.podManagementPolicy, and modifications to the podManagementPolicy field will be ignored after this PR is merged because sts prohibits modifications to this field.
    PodManagementPolicy: t.Spec.PodManagementPolicy,

Then the modification of these two fields is prohibited in the CRD. Do you agree or not? The modification method is as follows:

type Pool struct {
        // +kubebuilder:validation:XValidation:rule="self == oldSelf",message="name is immutable"
	Name string `json:"name"`
}

type TenantSpec struct {
        // +kubebuilder:validation:XValidation:rule="self == oldSelf",message="podManagementPolicy is immutable"
	PodManagementPolicy appsv1.PodManagementPolicyType `json:"podManagementPolicy,omitempty"`
}

@ramondeklein
Copy link
Contributor

@fengyinqiao I'm not an operator specialist, so I'm not the one to decide. I just look at what I would expect as a user of the operator.

@jiuker @cniackz I think you're more qualified to comment on this...

@jiuker
Copy link
Contributor

jiuker commented Jul 16, 2024

@pjuarezd cc

@sebhoss
Copy link

sebhoss commented Jul 23, 2024

I think what needs to be done here and in #2196 is an orphaned delete of the statefulset which keeps the pods around and then a re-creation of the statefulset with the updated values. See https://prometheus-operator.dev/docs/platform/storage/#resizing-volumes for reference

@fengyinqiao
Copy link
Contributor Author

I think what needs to be done here and in #2196 is an orphaned delete of the statefulset which keeps the pods around and then a re-creation of the statefulset with the updated values. See https://prometheus-operator.dev/docs/platform/storage/#resizing-volumes for reference

@sebhoss Read your link, this is a manual operation and maintenance of the operation process, which seems inappropriate to do it in the operator, or do we want to support these manual operations in the same way?

In addition, after the sts re-creation, the ownerReference of the pod still points to the old sts. Will there be some problems, such as the pod will not be automatically deleted after the tenant or the new sts is deleted, which will cause resource leakage?

@sebhoss
Copy link

sebhoss commented Jul 26, 2024

@fengyinqiao yeah sorry I was unclear I want I meant & I'm not a maintainer of this repo - just a user who stumbled across this PR randomly 😅

I think the idea behind #2196 is fantastic because it is much better to do this resize PVC dance automatically by an operator than do it manually as we have to do with the prometheus-operator. However the implementation is flawed because it needs to do 3 steps but only does 1 so far:

  1. check that the storage class allows resizes
  2. resize pvc
  3. somehow modify the statefulset to reflect the new storage capacity

Since 3. is not implemented you run into a problem that you are trying to solve with this PR. Instead of ignoring certain field updates while updating a statefuset, I think you should adopt what the prometheus-operator people have documented: Delete the sts using --cascade=orphan to keep the pods running & re-create sts with the updated fields. While doing that all currently running pods keep running. The owner reference points to the newly created sts and the sts holds the same storage capacity value as the tenant it originated from.

I think implementing a validating webhook is a nice idea as well, e.g. it could check whether the used storage class allows resize operations while users are modifying the tenant resource and reject such modification if resize is not possible. Likewise that webhook could prohibit every other update to a tenant that would result in a fatal change to the sts like point 1 from #2214 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants