Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CEL rules extension #269

Closed
wants to merge 4 commits into from
Closed

Conversation

lsviben
Copy link
Contributor

@lsviben lsviben commented Sep 1, 2023

Description of your changes

Disclaimer ⚠️
I am not entirely happy with this solution and am wondering if we should merge this. Putting it out for discussion and explaining my thought process below

Intro

In #237 we made all non-init fields optional, so that they
can be set either in spec.forProvider or spec.initProvider. For top level fields, we replaced the "required" validation with
CEL rules. Such as:

// +kubebuilder:validation:XValidation:rule="!('*' in self.managementPolicies || 'Create' in self.managementPolicies || 'Update' in self.managementPolicies) || has(self.forProvider.title) || has(self.initProvider.title)",message="title is a required parameter"

But we didn't have a solution for the nested fields. So the purpose of this PR is to add CEL validation for nested fields that should be required.

This proved to be tricky due to 2 factors:

Starting this task I had expected to generate something like:

... has(self.forProvider.someMap.nestedField) || ...
OR
... has(self.forProvider.someList[*].nestedField) || ...

But I was very wrong 😰 .

Problem explanation

Firstly, we generate almost every nested struct as an array. Even though the resource expects and needs only one element of that array. I think its due to how terraform schemas are translated into our types, as they mark all of the nested structs as lists of nested structs. I didn't go too deep into this as it's highly unlikely that we would make some changes there and break our API.

Example from upbound/provider-gcp Cluster:

	// Per-cluster configuration of Node Auto-Provisioning with Cluster Autoscaler to
	// automatically adjust the size of the cluster and create/delete node pools based
	// on the current needs of the cluster's workload. See the
	// guide to using Node Auto-Provisioning
	// for more details. Structure is documented below.
	// +kubebuilder:validation:Optional
	ClusterAutoscaling []ClusterAutoscalingParameters `json:"clusterAutoscaling,omitempty" tf:"cluster_autoscaling,omitempty"`
	
...
	type ClusterAutoscalingParameters struct {

	// Contains defaults for a node pool created by NAP. A subset of fields also apply to
	// GKE Autopilot clusters.
	// Structure is documented below.
	// +kubebuilder:validation:Optional
	AutoProvisioningDefaults []AutoProvisioningDefaultsParameters `json:"autoProvisioningDefaults,omitempty" tf:"auto_provisioning_defaults,omitempty"`

	// Whether node auto-provisioning is enabled. Must be supplied for GKE Standard clusters, true is implied
	// for autopilot clusters. Resource limits for cpu and memory must be defined to enable node auto-provisioning for GKE Standard.
	// +kubebuilder:validation:Optional
	Enabled *bool `json:"enabled,omitempty" tf:"enabled,omitempty"`
...

Secondly, it is how CEL works:

  • we need to be mindful of the scope
  • wildcards don't work

Scope

For the scope, it basically means that as we want to reach both forProvider and initProvider, the CEL rule has to be defined on spec. We cannot define a rule in a nested field as it wont have access to initProvider and vice-versa. Otherwise this would be very handy and we wouldn't have a problem with field selection as we could just set the rule on the nested struct in question.

Field selection

On the other hand, we have the problem of how to actually get through to the field we want to check if it is set. There are multiple ways.

Lets imagine we have the following resource:

spec:
  forProvider:
    someList:
    - reqField: "I am set"
      nonReqField: "I just chill here"

The CEL rule can look like this if we want to check just if the first element exists. For most cases this is ok as we expect only one element. For those that are truly lists we miss

has(self.forProvider.someList[0].reqField) || has(self.initProvider.someList[0].reqField)

CEL also provides an all, exists, exists_one macros that could be helpful:

self.forProvider.someList.all(e, has(e.reqField)) || self.initProvider.someList.all(e, has(e.reqField)) 

The problem here is that the check will not work well with true lists, as we could have some elements in initProvider, and some in forProvider, which is narrow edge case, but still worth to keep in mind. If we use exists we come back to a similar solution as using [0].
Example where all wont work well.

spec:
  initProvider:
    someList:
    - additionalField: "just an init field"
    - reqField: "I am set"
      nonReqField: "I just chill here"
  forProvider:
    someList:
    - reqField: "I am set"
      nonReqField: "I just chill here"
    - additionalField: "my required field is set in initProvider"

So lets say that we use the solution with [0]:

has(self.forProvider.someList[0].reqField) || has(self.initProvider.someList[0].reqField)

What if someList is optional? The field reqField is only required if someList actually exists. With the above rule, we would force users to use someList. Thats why we could add:

!has(self.forProvider.someList) || has(self.forProvider.someList[0].reqField) || has(self.initProvider.someList[0].reqField)

And as resources have required fields few levels deep, we can get things like this(example is from GCP Provider AccessLevel):

!(''*'' in self.managementPolicies || ''Create'' in self.managementPolicies
                || ''Update'' in self.managementPolicies) || !has(self.forProvider.basic)
                || !has(self.forProvider.basic[0].conditions) || !has(self.forProvider.basic[0].conditions[0].devicePolicy)
                || !has(self.forProvider.basic[0].conditions[0].devicePolicy[0].osConstraints)
                || has(self.forProvider.basic[0].conditions[0].devicePolicy[0].osConstraints[0].osType)
                || has(self.initProvider.basic[0].conditions[0].devicePolicy[0].osConstraints[0].osType)

which becomes unreadable, but works.

As I said in the beginning, not sure if we should include this. Maybe somebody will see a better solution. The current PR implements the above.

Fixes #239

I have:

  • Read and followed Crossplane's contribution process.
  • Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested

Tested with https://github.com/upbound/provider-gcp/. Ran make generate with my upjet. Created the /apis/accesscontextmanager/v1beta1/zz_accesslevel_types.go CRD in a local kind cluster and tested with a few examples:

Resulting CEL rules:

x-kubernetes-validations:
            - message: basic[0].conditions[0].devicePolicy[0].osConstraints[0].osType
                is a required parameter
              rule: '!(''*'' in self.managementPolicies || ''Create'' in self.managementPolicies
                || ''Update'' in self.managementPolicies) || !has(self.forProvider.basic)
                || !has(self.forProvider.basic[0].conditions) || !has(self.forProvider.basic[0].conditions[0].devicePolicy)
                || !has(self.forProvider.basic[0].conditions[0].devicePolicy[0].osConstraints)
                || has(self.forProvider.basic[0].conditions[0].devicePolicy[0].osConstraints[0].osType)
                || has(self.initProvider.basic[0].conditions[0].devicePolicy[0].osConstraints[0].osType)'
            - message: basic[0].conditions is a required parameter
              rule: '!(''*'' in self.managementPolicies || ''Create'' in self.managementPolicies
                || ''Update'' in self.managementPolicies) || !has(self.forProvider.basic)
                || has(self.forProvider.basic[0].conditions) || has(self.initProvider.basic[0].conditions)'
            - message: custom[0].expr[0].expression is a required parameter
              rule: '!(''*'' in self.managementPolicies || ''Create'' in self.managementPolicies
                || ''Update'' in self.managementPolicies) || !has(self.forProvider.custom)
                || !has(self.forProvider.custom[0].expr) || has(self.forProvider.custom[0].expr[0].expression)
                || has(self.initProvider.custom[0].expr[0].expression)'
            - message: custom[0].expr is a required parameter
              rule: '!(''*'' in self.managementPolicies || ''Create'' in self.managementPolicies
                || ''Update'' in self.managementPolicies) || !has(self.forProvider.custom)
                || has(self.forProvider.custom[0].expr) || has(self.initProvider.custom[0].expr)'
            - message: name is a required parameter
              rule: '!(''*'' in self.managementPolicies || ''Create'' in self.managementPolicies
                || ''Update'' in self.managementPolicies) || has(self.forProvider.name)
                || has(self.initProvider.name)'
            - message: parent is a required parameter
              rule: '!(''*'' in self.managementPolicies || ''Create'' in self.managementPolicies
                || ''Update'' in self.managementPolicies) || has(self.forProvider.parent)
                || has(self.initProvider.parent)'
            - message: title is a required parameter
              rule: '!(''*'' in self.managementPolicies || ''Create'' in self.managementPolicies
                || ''Update'' in self.managementPolicies) || has(self.forProvider.ti

Minimal passing validation:

apiVersion: accesscontextmanager.gcp.upbound.io/v1beta1
kind: AccessLevel
metadata:
  name: access-level-service-account
spec:
  forProvider:
    name: name
    parent: parent
    title: title

Result: PASSES
Description: contains the 3 required top level fields

Minimal not passing validation

apiVersion: accesscontextmanager.gcp.upbound.io/v1beta1
kind: AccessLevel
metadata:
  name: access-level-service-account
spec:
  forProvider:
    name: name
    parent: parent
    title: title

Result: FAILS
Description:

The AccessLevel "access-level-service-account" is invalid: spec: Invalid value: "object": no such key: initProvider evaluating rule: name is a required parameter

Nested validation - one level

apiVersion: accesscontextmanager.gcp.upbound.io/v1beta1
kind: AccessLevel
metadata:
  name: access-level-service-account
spec:
  forProvider:
    basic:
    - conditions:
      - devicePolicy:
        - requireScreenLock: false
    name: name
    parent: parent
    title: title

Result: PASSES
Description: sets just basic[0].conditions[0].devicePolicy[0]. Does not go further into osConstraints[]

Signed-off-by: lsviben <[email protected]>
@lsviben
Copy link
Contributor Author

lsviben commented Sep 14, 2023

Discussed this internally with @turkenh and there is another approach that we could consider, but it has some tradeoffs.

The idea is to just scope the CEL rule on the nested struct itself, this way we wouldnt need to check if all the previous arrays are filled out or not. In practice that would make:

From this

!has(self.forProvider.someList) || has(self.forProvider.someList[0].reqField) || has(self.initProvider.someList[0].reqField)

To this:

has(self.reqField)

Which means that we actually dont need to use CEL rules for nested structs at all and we can just keep them required.

The tradeoff is that in this case, we will need to resign from merging initProvider to forProvider nested structs. Instead we would just append them. Because each struct will need to be valid by itself, we cant have a required field in initProvider and another in forProvider, and then merge them as that wont pass validation.

An example for that (autoscaler), would be now invalid as the scalingConfig required fields are desiredSize, maxSize and minSize:

  initProvider:
    tags:
      someTag: tag
    forceUpdateVersion: false
    scalingConfig:
      - desiredSize: 1
  forProvider:
    clusterNameRef:
      name: sample-eks-cluster
    nodeRoleArnRef:
      name: sample-node-role
    region: us-west-1
    scalingConfig:
      - maxSize: 4
        minSize: 1

Instead we would need to set all of them in either spec.forProvider or spec.initProvider:

  initProvider:
    tags:
      someTag: tag
    forceUpdateVersion: false
    scalingConfig:
      - desiredSize: 1
      - maxSize: 4
      - minSize: 1
  forProvider:
    clusterNameRef:
      name: sample-eks-cluster
    nodeRoleArnRef:
      name: sample-node-role
    region: us-west-1

We could already see that this could be an issue if we would like to change just maxSize and minSize during the resource lifecycle. So we are not going with this solution for now.

All in all, our conclusion was that for now we dont have a good solution for the nested structs validation, and that for now it is not pressing to have that, so we could consider dropping this for now until we need it or think of a better solution.

@turkenh
Copy link
Member

turkenh commented Sep 15, 2023

All in all, our conclusion was that for now we dont have a good solution for the nested structs validation, and that for now it is not pressing to have that, so we could consider dropping this for now until we need it or think of a better solution.

I agree.
I will close this PR (for now?) and deprioritize the corresponding issue (i.e. drop it from the scope of Beta graduation). /cc @jeanduplessis @ulucinar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend CEL rules to nested structs
2 participants