Use granular merge strategy for singleton lists #380

JonathanO · 2024-03-15T16:21:40Z

Description of your changes

Many TF providers nest blocks using a ListType with MaxItems=1, which upjet converts into a list containing a map. The default server side apply merge strategy for lists is atomic, resulting in conflicts between the provider and composition functions. This can be fixed on a case-by-case basis in the providers (crossplane-contrib/provider-upjet-gcp#457), but it seems likely that all of these should be treated as if they were maps (which default to granular merge.)

While it would be great to fix this by flattening the unnecessary array, that would force an API change on all the providers.

This PR instead adjusts the ServerSideApplyMergeStrategies in the config to use ListTypeMap with a defaulted key for a singleton list. It works by altering the config, rather than during creation of Fields, to avoid duplicating the implementation used for ServerSideApplyMergeStrategies.

It also moves the existing server side apply marker defaulting for scalar sets and maps to work in the same way for consistency. I'm not sure that it's necessary to apply the granular strategy in the TypeMap case, as I think that's the k8s default, but I've left it in place to avoid additional changes to the output.

I have:

Read and followed Upjet's contribution process.
Run make reviewable to ensure this PR is ready for review.
Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested

Generated updated GCP provider, which was run in Kind.

…ment support for singleton lists. Move applying SSA merge default config to provider. Use correct Computed value for list merge keys Signed-off-by: Jonathan Oddy <[email protected]>

ulucinar · 2024-03-22T09:58:34Z

Hi @JonathanO,
Thank you for bringing this up. We had previously considered the idea of generating, by the default, an SSA merge strategy of map for singleton lists here:

Add server-side apply merge strategy markers #301: Please see this comment and my reply to it.
Support merging with an array item in composition patches crossplane#3335: I had originally classified this issue as an enhancement and not a bug. My main motivation in doing so was that for certain associative lists, it makes more sense to update the list elements atomically, and for some others, granularly. This depends on the relationships between the list object's fields and I think this is one of the reasons, in the context of SSA, we have a merge strategy of atomic in addition to the map strategy.

My concerns with defaulting an SSA merge strategy of map for all object lists with the MaxItems = 1 constraint are as follows:

As you've also mentioned in the PR description, we would like to convert singleton object lists into embedded objects. A relevant issue is here. We've (relatively) recently introduced support for multi-version CRDs and reusable chained API conversion functions in upjet, which will allow us to hide the breaking changes when we convert singleton lists into embedded objects (and there are some other challenges in implementing this). Converting singleton lists into embedded objects is currently on our roadmap. A relevant roadmap item is generating the CRD validation rules that will enforce the cardinality constraints on the object lists including the singleton ones.
There might already exist an "index set" (i.e., object fields that constitute the list map keys). Adding an artificial index field when there already exists a natural index set (possibly with a cardinality greater than 1) is a normalization issue for singleton object lists. We had done some observations across the providers crossplane-contrib/provider-upjet-{aws,azure,gcp}, and there are object lists with already existing natural index sets in these providers. We could implement some heuristics like if a field with name name exists in the list element's schema, do not inject an artificial index field but rather declare the index set as {"name"}. The available resource configuration API for specifying a map strategy for object lists allows this. However, there are also problems associated with such heuristics as for a proper SSA merge implementation, the index set must uniquely identify distinct objects. From a theoretical point of view, a heuristic like this one is flawed as it could potentially violate the index set uniqueness property. Of course, the MaxLength = 1 constraint on the list size in our context relaxes the index set uniqueness constraint (as there can only be one element in the list) but we currently don't enforce the list size constraints at the API server. It's currently only enforced at the Terraform layer. As mentioned above, generating the proper CRD validation rules is on our roadmap.
What happens if a (singleton) object list's MaxItems = 1 constraint is relaxed and it's no longer a singleton? For certain object lists, this could be semantically viable and for some others, this does not make sense. Currently, when configuring the SSA merge strategy for object lists, we can make judgements in this regard. For example, if the singleton object list we are configuring is a defaultNodeConfig, I feel more comfortable because it's the default node configuration, probably it doesn't make much sense to have multiple defaults in the future (they would be more like presets, not a default). This question is relevant in other contexts too, such as converting singleton lists into embedded objects.
I also don't like the fact that we are injecting fields into our API as it complicates the API and the implementation. This idea had previously been considered as a violation of the high-fidelity principal of the Crossplane resource model here but a client-side merge implementation like one I suggested in that issue never took off for various reasons, and we had to address v0.44.0 - Cluster eks.aws.upbound.io unable to select subnetIds crossplane-contrib/provider-upjet-aws#975, and so opted to address it on the server-side via the SSA merge strategies.

So, my proposal would be:

We implement the CRD validation rules that will enforce the size constraints from the underlying Terraform schema. The Terraform layer already enforces them but we had better enforce first at the API server level in alignment with the Terraform schema.
We convert singleton lists into embedded objects so that we no longer need to configure the list map strategy and the associated index sets for them. Having introduced the support for multi-version CRDs has unblocked this.

What do you think @JonathanO?

JonathanO · 2024-03-22T13:23:27Z

We convert singleton lists into embedded objects so that we no longer need to configure the list map strategy and the associated index sets for them. Having introduced the support for multi-version CRDs has unblocked this.

That would be the ideal solution, but I raised this PR as a possible stop-gap solution until such a time that converting to embedded objects is implemented. If converting to embedded objects is likely to be available in the very near future then I agree that this PR is not necessary.

The atomic merge behaviour creates problems, and today the fix is to add ListTypeMap merge configuration to individual resources in providers. That's quite a frustrating user experience (because you discover the incorrect merge strategy only after running into problems), and will also need to be undone once the conversion to embedded objects is available.

Covering your other concerns:

There might already exist an "index set"

As you go on to mention, this is not a problem for the case where MaxLength = 1, since there should only be one entry.

What happens if a (singleton) object list's MaxItems = 1 constraint is relaxed

It's an API change and would need a new CRD version (which will be the same as if the singleton list had been converted to an embedded object.) Fortunately this seems unlikely to happen often given how these blocks are used in the Terraform.

I also don't like the fact that we are injecting fields into our API

I agree, but at the moment we're raising PRs against providers to add these injected fields on a case-by-case basis - they're slowly ending up (inconsistently) in provider APIs already.

jeanduplessis · 2024-03-27T18:16:46Z

In today's Upjet SIG meeting, we discussed not going ahead with this PR with @JonathanO.

The team from Upbound will implement a solution where singleton lists (lists of only one (required or optional) object) are converted to embedded objects. We hope to roll out this solution in the next few weeks.

Inject merge strategies into the config rather than the output, imple…

eac88d3

…ment support for singleton lists. Move applying SSA merge default config to provider. Use correct Computed value for list merge keys Signed-off-by: Jonathan Oddy <[email protected]>

JonathanO force-pushed the jono-singleton-list-granular-merge branch from 9380ba1 to eac88d3 Compare March 18, 2024 13:02

JonathanO marked this pull request as ready for review March 18, 2024 13:07

JonathanO requested review from ulucinar, sergenyalcin, erhancagirici and mergenci as code owners March 18, 2024 13:07

jeanduplessis closed this Mar 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use granular merge strategy for singleton lists #380

Use granular merge strategy for singleton lists #380

JonathanO commented Mar 15, 2024 •

edited

Loading

ulucinar commented Mar 22, 2024 •

edited

Loading

JonathanO commented Mar 22, 2024

jeanduplessis commented Mar 27, 2024

Use granular merge strategy for singleton lists #380

Use granular merge strategy for singleton lists #380

Conversation

JonathanO commented Mar 15, 2024 • edited Loading

Description of your changes

How has this code been tested

ulucinar commented Mar 22, 2024 • edited Loading

JonathanO commented Mar 22, 2024

jeanduplessis commented Mar 27, 2024

JonathanO commented Mar 15, 2024 •

edited

Loading

ulucinar commented Mar 22, 2024 •

edited

Loading