Skip to content

CreateOrUpdate performs Updates even if the object is unchanged #3191

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ian-howell opened this issue Apr 6, 2025 · 11 comments · May be fixed by #3193
Open

CreateOrUpdate performs Updates even if the object is unchanged #3191

ian-howell opened this issue Apr 6, 2025 · 11 comments · May be fixed by #3193

Comments

@ian-howell
Copy link

ian-howell commented Apr 6, 2025

The controllerutil.CreateOrUpdate does not correctly handle updates.

Given the same resource multiple times, CreateOrUpdate will always perform an UPDATE api call, and report that it did so, even if there were no updates to the resource. This can be easily verified by making the following modifcation to the unit test

Image

This diff results in the following error:

Image

This is problematic:

  1. It result in unnecessary API calls,
  2. It reduces visibility into what a controller is doing. The function suggests that it will return OperationNone if no changes were made, however controller authors still need to manually check if their resource changed. This reduces the ability to properly log and emit metrics, as well as other problems such as those offered in CreateOrUpdate falsely claiming its updating resources  #847 (comment)

Root Cause in this comment:

#3191 (comment)

@ian-howell ian-howell changed the title CreateOrUpdate is not idempotent CreateOrUpdate performs Updates even if the object is unchanged Apr 6, 2025
@ian-howell
Copy link
Author

My gut reaction was that this would be related to changing ResourceVersions, but that doesn't appear to be the case. This diff results in the same failure as above (i.e., the ResourceVersion didn't change):

Image

@ian-howell
Copy link
Author

Oh this is even weirder. The Gomega Equal matcher is not finding any difference between the before and after. Notably, this diff uses the same DeepCopyObject method used inside of CreateOrUpdate. The Expect(deploy).To(Equal(prevDeploy)) passes, but the result check still fails:

Image

@ian-howell
Copy link
Author

CreateOrUpdate depends on equality.Semantic.DeepEqual to determine if the object was modified. This rabbit hole goes deeper though, because this addition to the above diff also passes:

Expect(equality.Semantic.DeepEqual(deploy, prevDeploy)).To(BeTrue())

Why does that function return true in the test, but false in CreateOrUpdate?

@ian-howell
Copy link
Author

This addition to CreateOrUpdate causes the test to pass (I would be very worried if it didn't...):

Image

So the issue must be occurring thanks to the mutateFn

@alvaroaleman
Copy link
Member

Please provide existing and obj, otherwise this is not actionable

@ian-howell
Copy link
Author

I didn't create existing and obj - they're already in CreateOrUpdate:

obj is the argument that gets passed to CreateOrUpdate.

existing is created inside the function.

All of the code I've shown is all in-repo. This commit on my fork demonstrates the issue.

@alvaroaleman
Copy link
Member

If you CreateOrUpdate the same object twice in rapid succession then it will always trigger an update yes. This is because the cache from which it reads is eventually consistent, so it will not see the result of the first CreateOrUpdate in the second and issue an update (or create).

@ian-howell
Copy link
Author

ian-howell commented Apr 7, 2025

I'm probably missing something, but it doesn't look like the test client uses a cache? Here's it's initialization:

c, err = client.New(cfg, client.Options{})

and here's the docstring for client.New:

// The client's read behavior is determined by Options.Cache.
// If either Options.Cache or Options.Cache.Reader is nil,
// the client reads directly from the API server.

Anyway, I did some more testing and I think I've figured it out.

The following is pulled from my branch. I recommend pulling and testing the branch, but I'll leave the contents here as well for convenience:

It("updates only changed objects", func() {                                                                                         
	// Prior to the test, deploy only has ObjectMeta. This is due to the BeforeEach.                                            
	op, err := controllerutil.CreateOrUpdate(context.TODO(), c, deploy, specr)                                                  
                                                                                                                                    
	// After the above CreateOrUpdate, deploy will have deplSpec attached by the specr                                          
	// function.                                                                                                                
	Expect(op).To(BeEquivalentTo(controllerutil.OperationResultCreated))                                                        
	Expect(err).NotTo(HaveOccurred())                                                                                           
                                                                                                                                    
	// Let's save a copy of deploy and then perform a no-op on it. We expect that existing                                      
	// will be equal to deploy after the no-op.                                                                                 
	existing := deploy.DeepCopyObject()                                                                                         
	// For sanity, make sure that DeepCopyObject actually worked.                                                               
	Expect(existing).To(Equal(deploy))                                                                                          
                                                                                                                                    
	// Call specr. This this is the same function called from CreateOrUpdate, we expect                                         
	// that it will be a no-op.                                                                                                 
	err = specr()                                                                                                               
	Expect(err).NotTo(HaveOccurred())                                                                                           
                                                                                                                                    
	// However...                                                                                                               
	Expect(existing).To(Equal(deploy))                                                                                          
                                                                                                                                    
	// The above function fails, but fortunately for us, it also prints the differences.                                        
	// It looks something like this (the common lines have been removed):                                                       
	//                                                                                                                          
	//    <*v1.Deployment | 0xc000543908>: {                      |       <*v1.Deployment | 0xc00014ef08>: {                    
	//            Replicas: 1,                                    |               Replicas: nil,                                
	//                            TerminationMessagePath: "/dev/t |                               TerminationMessagePath: "",   
	//                            TerminationMessagePolicy: "File |                               TerminationMessagePolicy: "", 
	//                            ImagePullPolicy: "Always",      |                               ImagePullPolicy: "",          
	//                                                                                                                          
	// I took a look at the API for deployment, and it turns out that Replicas will                                             
	// default to 1 if not set. The remaining fields are part of the Container API type,                                        
	// and also have default values that align with the above.                                                                  
	//                                                                                                                          
	// We know that the second specr should be a no-op, so what else could be going on?                                         
	// Let's revisit the *first* call to CreateOrUpdate. We made the assumption that after                                      
	// that call, the only contents of deploy would be the ObjectMeta and deplSpec that we                                      
	// defined. However, we can see from the above output that this isn't true. The                                             
	// contents of deploy (at the time) included the default values. I believe this                                             
	// happened because the Defaulter for the deployment would've been run against it.                                          
	// This also means that existing would have the default values. When we call specr, we                                      
	// wipe out those default values, and our test fails.                                                                       
                                                                                                                                    
	// Finally, suppose the above call to specr (and the assertions) is commented out.                                          
	// When we fetch the resource for the second time in the below call to CreateOrUpdate,                                      
	// we'll get the complete object, with defaults and all. But then when we call specr,                                       
	// we'll replace the entire Spec (which has the defaults), with deplSpec (which                                             
	// *doesn't* have the defaults), making the call to equality.Semantic.DeepEqual fail                                        
	// on every call, and thus CreateOrUpdate will *ALWAYS* perform an update.                                                  
	op, err = controllerutil.CreateOrUpdate(context.TODO(), c, deploy, specr)                                                   
	By("returning no error")                                                                                                    
	Expect(err).NotTo(HaveOccurred())                                                                                           
                                                                                                                                    
	By("returning OperationResultNone")                                                                                         
	Expect(op).To(BeEquivalentTo(controllerutil.OperationResultNone))                                                           
})                                                                                                                                  

In conclusion, if the mutate function sets a field with a non-zero default to the zero value for that field, then it can never be the case that the resource fetched from the API server is equal to the resource passed to CreateOrUpdate, and it will always perform an update.

@troy0820
Copy link
Member

troy0820 commented Apr 8, 2025

I am still trying to understand what the issue is. If you use the method twice, wouldn't the resourceVersion be different? Because you are always Getting the object from the first time you used it?

Therefore the eventual consistency from what @alvaroaleman is talking about means that the resource is being "updated"?

I would caution with the edge case only because conflating this behavior as inconsistent to change would likely force an update that will break other users.

The tooling itself requires the mutateFn to always work on the object and that seems to make it not equal. Curious to understand more of why the expected behavior of even using this twice would result in a OperationResultNone.

@alvaroaleman
Copy link
Member

I think the issue is that their mutate function resets fields that get defaulted due to golang zero values, which is why there is always a diff. But short of duplicating all possible validations, I don't see how we can do anything about that in c-r

@ian-howell
Copy link
Author

Yes, the defaulting is the problem.

@troy0820's answer made me notice something else though - if there are any values set on obj prior to calling CreateOrUpdate, they might be overwritten by the call to Get.

I agree with @alvaroaleman that c-r shouldn't be responsible for the defaulting problem, but perhaps we could update the documentation? I've created a PR with some potential additions: #3193

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants