-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate singleton lists as embedded objects #387
Conversation
f22cff7
to
017f4f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work @ulucinar. Please find my comments below.
// +kubebuilder:storageversion | ||
{{ end }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Would {{- end }}
be more appropriate here? Otherwise, it seems to me that double blank lines will follow.
03260c8
to
0c57170
Compare
d6c4f08
to
9a463d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switch mode { | ||
case ToSingletonList: | ||
slices.Sort(paths) | ||
case ToEmbeddedObject: | ||
sort.Slice(paths, func(i, j int) bool { | ||
return paths[i] > paths[j] | ||
}) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this sorting because of the order of the path processing? I mean, we ensure that conversions are performed correctly even when nested structures are present. For example, in the ToEmbeddedObject case, the deeper structures are processed first. Because if we change the deeper structures before the higher-level structures, it will not affect the conversion of the higher-level structures.
What do you think about whether there is any potential risk during this approach? Do you think it's worth coding this in more conventional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The relevant issue here is the fieldpath expressions use indices (e.g., [*]
) for traversing lists. We generate the fieldpath expressions for the singleton lists that have been converted using the expressions that work on the original Terraform topology. And we need to convert from embedded objects to singleton lists (when passing data from the XP layer to the TF layer) and from singleton lists to embedded objects (when passing data from the TF layer to the XP layer) at runtime using these fieldpath expressions. If we do not process the field conversions in the correct order, the expressions may mismatch the temporary topology during a conversion. For example, let's assume a singleton list contains another nested singleton list, both of which we've converted and let's assume the corresponding fieldpath expressions for the conversion targets are a.b[*]
and a.b[*].c[*]
, respectively. When converting from embedded objects to singleton lists, converting a.b
(which is an embedded object in the source) before its children relieves us from having to adjust the fieldpath expressions with the a.b[*].
prefix because once a.b
is converted into a singleton list, then the fieldpath expressions that target its children now match the (converted) topology. Similarly, when converting from singleton lists to embedded objects at runtime (e.g., after reading the TF state), converting a.b[*]
after a.b[*].c[*]
relieves us from having to adjust the fieldpath expressions that target a.b[*]
's children.
This is the motivation of processing the conversions in an order. We could as well adjust the fieldpath expressions to match the topology as we perform the conversions, but this looks simpler to me.
gvk := dst.GetObjectKind().GroupVersionKind() | ||
if err := runtime.DefaultUnstructuredConverter.FromUnstructured(srcMap, dst); err != nil { | ||
return errors.Wrap(err, "cannot convert the map[string]any representation of the source object to the conversion target object") | ||
} | ||
// restore the original GVK for the conversion destination | ||
dst.GetObjectKind().SetGroupVersionKind(gvk) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand, we moved this block to IdentityConversion, right? What do you think about a wrong configuration (via Resource.Conversions) that may override this part? Then we will miss the identity copying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, correct. Previously, conversion.RoundTrip
would unconditionally do the identity conversion which copies fields with the same name & path into the target object. But in the scenario where we have converted singleton lists into embedded objects, the source and the target end up having fields with different types (a list and an object) at identical paths, which breaks the identity conversion.
Furthermore, it's possible to have, for instance, a conversion.ManagedConversion
, which performs the identity conversion itself (e.g., by directly copying some metadata
fields from source into the target). We previously did not have this flexibility and conversion.RoundTrip
would always do the identity conversion.
It's now optional and for backwards compatibility, config.DefaultResource
now initializes the config.Resource.Conversions
slice with the identity conversion. In contexts where we don't want the identity conversion, then we need to replace Conversions
slice.
There could in theory be clients which just replace the slice here although they need the identity conversion and will be broken when they consume these changes. If they are overriding the default Conversions
slice, then it means they are not interested in using the defaults. But probably they just assume there are no defaults instead of not using these (or any future) defaults. I consider this as a bad practice. Another complication is that there could be multiple places in the provider code where the converters are added. If multiple sites replace these converters (instead of appending to already existing converters), then again it's probably unintended.
Unfortunately, I think this has roots in not following the principle of encapsulation in the resource configuration API (the configuration data fields are just exported). This requires a self-discipline at the client side.
1543b80
to
83e4f07
Compare
- Terraform configuration blocks, even if they have a MaxItems constraint of 1, are (almost) always generated as lists. We now generate the lists with a MaxItems constraint of 1 as embedded objects in our MR APIs. - This also helps when updating or patching via SSA the (previously list) objects. The merging strategy implemented by SSA requires configuration for associative lists and converting the singleton lists into embedded objects removes the configuration need. - A schema traverser is introduced, which can decouple the Terraform schema traversal logic from the actions (such as code generation, inspection, or singleton-list-to-embedded-object conversion) taken while traversing the schema. Signed-off-by: Alper Rifat Ulucinar <[email protected]>
Signed-off-by: Alper Rifat Ulucinar <[email protected]>
the generated CRD API version as the storage version. Signed-off-by: Alper Rifat Ulucinar <[email protected]>
…versions between singleton list & embedded object API versions. - Export conversion.Convert to be reused in embedded singleton list webhook API conversions. Signed-off-by: Alper Rifat Ulucinar <[email protected]>
the storage & hub API versions independently. - The default values for both of the storage & hub versions is the version being generated, i.e., Resource.Version. - Replace pipeline.ConversionHubGenerator & pipeline.ConversionSpokeGenerator with a common pipeline.ConversionNodeGenerator implementation - Now the hub generator can also inspect the generated files to regenerate the hub versions according to the latest resource configuration and we have removed the assumption that the hub version is always the latest version generated. - Fix duplicated GKVs issue in zz_register.go. Signed-off-by: Alper Rifat Ulucinar <[email protected]>
…Converter Signed-off-by: Alper Rifat Ulucinar <[email protected]>
…ield paths when the conversion.RoundTrip copies the fields, from src to dst, with the same name. Signed-off-by: Alper Rifat Ulucinar <[email protected]>
Signed-off-by: Alper Rifat Ulucinar <[email protected]>
Signed-off-by: Alper Rifat Ulucinar <[email protected]>
- Fix runtime conversion for expanded field paths of length greater than 1. Signed-off-by: Alper Rifat Ulucinar <[email protected]>
- Prevent execution of these pipelines multiple times for each available versions of an API group. - Improve conversion.RoundTrip paved conversion error message Signed-off-by: Alper Rifat Ulucinar <[email protected]>
Signed-off-by: Alper Rifat Ulucinar <[email protected]>
Signed-off-by: Alper Rifat Ulucinar <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ulucinar LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your changes @ulucinar 🙏 LGTM.
Description of your changes
Fixes #136
Terraform configuration blocks, even if they have a
MaxItems
constraint of 1, are (almost) always generated as lists. We now generate the lists with a MaxItems constraint of 1 as embedded objects in our MR APIs.This also helps when updating or patching via SSA these list objects (which are now converted to embedded objects). The merging strategy implemented by SSA requires configuration for associative lists and converting the singleton lists into embedded objects removes the configuration need.
In addition to the changes in the code generation pipelines, this PR also implements the runtime conversion logic needed to convert the CRD's embedded objects back to the singleton lists when passing data to the Terraform stack. We also need to convert the singleton lists we read from the Terraform stack into embedded objects for updating
spec.forProvider
(late-initialization),status.atProvider
, or when updating the connection details secrets.This PR also introduces a schema traverser, which can decouple the Terraform schema traversal logic from the actions (such as code generation, inspection, or singleton-list-to-embedded-object conversion) taken while traversing the schema.
When providers consume these changes, they are forced to generate the singleton lists in their APIs as embedded objects. By default, the generated APIs should not be affected. A provider can opt in the conversion, by specifying the following Provider option in upjet's
config.NewProvider
:This traverser will detect the singleton lists and mark them to be generated as embedded objects. Provider maintainers will also have the flexibility to override the resource configuration generated by this traverser with upjet's resource configuration framework. This implies that the resource configuration overrides are executed after the mutating traversers act on the resource configurations.
Although it's possible to overwrite the existing CRD API versions, in the providers, with the converted APIs, the providers consuming this change can also generate new versions of their CRD APIs with the embedded object schemas. We extend the existing multi-version CRD generation support to allow specifying the storage version and the hub version via
config.Resource.SetCRDStorageVersion
andconfig.Resource.SetCRDHubVersion
, respectively.config.Resource.CRDStorageVersion
andconfig.Resource.CRDHubVersion
can be used to access the configured storage and hub versions, respectively. These have been implemented as functions instead of struct fields to enforce defaulting: If the storage version or the hub version is not specified, then the default is to use the configured API version. This is also the current behavior.If the providers opt to generate the converted APIs as new versions, this PR also implements two common CRD API conversion functions to be invoked, at runtime, by their conversion webhooks. These conversion functions can be registered with the API conversion registry as follows:
The identity converter will handle copying of the same name fields from the conversion source object to the target, ignoring the converted fields (remember that the embedded objects in the new API have the same field names as the singleton lists in the old API but their types differ and thus, they cannot be handled by the identity converter).
The singleton list converter works on the converted fields at any depth and is responsible for the conversions between the singleton lists and the embedded objects.
If the providers use these converters, they can keep their old APIs (e.g.,
v1beta1
) intact and introduce the embedded object schemas in the new API versions (e.g.,v1beta2
). When it encounters an oldv1beta1
API with the singleton lists, the registered conversion webhook will be able to convert it to the newv1beta2
API with the embedded objects seamlessly.I have:
make reviewable
to ensure this PR is ready for review.backport release-x.y
labels to auto-backport this PR if necessary.How has this code been tested
Manually tested via an affected resource (
AccessPolicy. conditionalaccess
, which currently has many singleton lists in its API) incrossplane-contrib/provider-upjet-azuread
.Also validated the same resource by consuming the upjet changes from this PR's feature branch but not configuring the list converter in
crossplane-contrib/provider-upjet-azuread
. This is to make sure that the proposed changes in the code generation pipeline & the upjet runtime are both backwards-compatible.