Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend ability to set topologySpreadConstraints to all component workloads #589

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

relu
Copy link
Contributor

@relu relu commented Jul 25, 2024

This should allow users to configure topologySpreadhConstraints for all operator-managed workloads.

@relu relu force-pushed the extend-tsc-usage branch from ce748b8 to ae4f0ba Compare July 25, 2024 20:26
Copy link
Member

@frouioui frouioui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the operator-latest.yaml file has to be updated with the latest CRD too. Additionally, a follow-up PR will have to be made on the vitess repository to update the given operator.yaml file in the examples.

@mattlord mattlord changed the title Extend ability to set topologySpreadhConstraints to all component workloads Extend ability to set topologySpreadConstraints to all component workloads Aug 1, 2024
Copy link
Collaborator

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @relu ! It looks good to me. I only had a few minor questions and suggestions. I can come back to this quickly when you have a chance to respond.

Thanks again!

Comment on lines 49 to 52
if tsc.MinDomains != nil {
minDomainsSeconds = fmt.Sprint(*tsc.MinDomains)
}
writeStringHash(h, minDomainsSeconds)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if tsc.MinDomains != nil {
minDomainsSeconds = fmt.Sprint(*tsc.MinDomains)
}
writeStringHash(h, minDomainsSeconds)
if tsc.MinDomains != nil {
writeStringHash(h, string(*tsc.MinDomains))
}

@@ -0,0 +1,64 @@
/*
Copyright 2020 PlanetScale Inc.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use 2024 for new files.

Comment on lines -305 to -318
srcMap, srcErr := metav1.LabelSelectorAsMap(srcObj.LabelSelector)
if srcErr != nil {
// There is no return of error for this function
// or its callers, which makes this a bit of a hack
srcMap = map[string]string{}
}
// If this item is already there, update it.
for dstIndex := range *dst {
dstObj := &(*dst)[dstIndex]
dstMap, dstErr := metav1.LabelSelectorAsMap(dstObj.LabelSelector)
if dstErr != nil {
dstMap = map[string]string{}
}
if reflect.DeepEqual(dstMap, srcMap) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why this code was added, so also not sure that this is safe. Reads to me like this was primarily about handling nils and invalid/unexpected data which we lose here. Can you please explain the reasoning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this was intended to work was that the list of corev1.TopologySpreadConstraint items would be matched exclusively on the labelSelector, causing any previously existing items having the same labelSelector to be overridden.
This behavior doesn't make sense because the labelSelector will most likely be the same across all topology spread constraint items. Instead, we should only append to the list and not touch any existing values.

I've simplified the code a bit and added some test examples.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's reasonable to do a DeepEqual here as it will catch any potential change to the Topology Spread Constraints.

However, I am a bit afraid of the performance implication that reflect.DeepEqual brings. Since we are calling it in a nested loop, if the user defines many constraints on their vttablet or etcd pods we might be left with a longer update. To avoid this potential issue, could we create a method to compare two constraints?

@frouioui
Copy link
Member

frouioui commented Oct 7, 2024

Hey @relu! Are you still working on this? It would be nice to include this in the upcoming v2.14.0 release of the vitess-operator if we can. The RC-1 is scheduled in a week, and GA late October.

Copy link
Member

@frouioui frouioui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the comments I have left it looks good to me. Do you have a manual E2E test for this, if so, could you share the script and output with us?

Also, please open a follow-up PR on vitessio/vitess to update the operator.yaml with what you changed on this PR in operator-latest.yaml.

@@ -1227,6 +1227,20 @@ ServiceOverrides
<p>Tolerations allow you to schedule pods onto nodes with matching taints.</p>
</td>
</tr>
<tr>
<td>
<code>topologySpreadConstraints</code></br>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way we generate documentation changed in #625. We no longer use </br> as it can break in Markdown. Do you mind merging main onto your branch and re-generating the docs?

Comment on lines +46 to +55
writeStringHash(h, strings.Join(tsc.MatchLabelKeys, ""))
if tsc.MinDomains != nil {
writeStringHash(h, string(*tsc.MinDomains))
}
if tsc.NodeAffinityPolicy != nil {
writeStringHash(h, string(*tsc.NodeAffinityPolicy))
}
if tsc.NodeTaintsPolicy != nil {
writeStringHash(h, string(*tsc.NodeTaintsPolicy))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good catch. I am guessing the hash we were previously computing were not encapsulating all the information we could add?

Comment on lines +35 to +37
if nilHash == zeroHash {
t.Errorf("nilHash = zeroHash = %v; expected different values", nilHash)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of checking the equality manually we should use require.NotEqual. It is something we have adopted a while ago on vitessio/vitess. We should start doing the same for new tests on the operator. Same applies to TestTopologySpreadConstraintNodeAffinityPolicysNil and TestTopologySpreadConstraintNodeTaintsPolicysNil.

}

func TestTopologySpreadConstraintNodeAffinityPolicysNil(t *testing.T) {
// Make sure nil is distinguishable from 0.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment does not entirely make sense here, same for the following test

Comment on lines -305 to -318
srcMap, srcErr := metav1.LabelSelectorAsMap(srcObj.LabelSelector)
if srcErr != nil {
// There is no return of error for this function
// or its callers, which makes this a bit of a hack
srcMap = map[string]string{}
}
// If this item is already there, update it.
for dstIndex := range *dst {
dstObj := &(*dst)[dstIndex]
dstMap, dstErr := metav1.LabelSelectorAsMap(dstObj.LabelSelector)
if dstErr != nil {
dstMap = map[string]string{}
}
if reflect.DeepEqual(dstMap, srcMap) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's reasonable to do a DeepEqual here as it will catch any potential change to the Topology Spread Constraints.

However, I am a bit afraid of the performance implication that reflect.DeepEqual brings. Since we are calling it in a nested loop, if the user defines many constraints on their vttablet or etcd pods we might be left with a longer update. To avoid this potential issue, could we create a method to compare two constraints?

@frouioui
Copy link
Member

frouioui commented Dec 4, 2024

Hi @relu! Any update on this PR? It has been left without update for more than a month.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants