Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-24588: use new workload controller with preconditions #247

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

RomanBednar
Copy link
Contributor

@RomanBednar RomanBednar commented Jul 24, 2024

ClusterCSIDriver condition when precondition fails:

  - lastTransitionTime: "2024-07-24T11:13:45Z"
    message: |
      dummy precondition 3 failed
    reason: PreconditionNotFulfilled
    status: "True"
    type: DeploymentDegraded

Controllers can be configured with csi-operator to have multiple preconditions, errors are then aggregated in ClusterCSIDriver:

- lastTransitionTime: "2024-07-24T11:13:45Z"
    message: |
      dummy error 2
      dummy error 3
    reason: PreconditionNotFulfilled
    status: "True"
    type: DeploymentDegraded

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 24, 2024
Copy link
Contributor

openshift-ci bot commented Jul 24, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Contributor

openshift-ci bot commented Jul 24, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RomanBednar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 24, 2024
Comment on lines 138 to 139
type Precondition func(ctx context.Context, c *clients.Clients) (bool, error)
type PreconditionFunc func(context.Context) (bool, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why there are two functions that do the same? IMO only PreconditionFunc should exist and perhaps be a type defined somewhere in the workload controller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember, probably wanted to keep it separate but there can be just one - changed.

Comment on lines 118 to 128
func (o *OperatorControllerConfig) AddPreconditions(c *clients.Clients, builders ...Precondition) {
for _, builder := range builders {
f := func(ctx context.Context) (bool, error) {
return builder(ctx, c)
}
o.Preconditions = append(o.Preconditions, f)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just add preconditions (better PreconditionFuncs) to o.Preconditions without any closure in between.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, changed.

Comment on lines 205 to 208
required, err := c.getDeployment(opSpec)
if err != nil {
return nil, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syncDeleting already has required in its parameters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overlooked, removing.

return nil, err
}

return required, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The controller conditions will be calculated from required, which is a Deployment loaded from yaml manifests, which is odd.

Currently, I am not sure what would be the right condition. Please add a TODO comment, we can sort it out later.

Personally, I think no conditions at all would be the best. I.e. the controller should remove Available / Progressing, an operator has to add their own condition in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, dealing with conditions in operator makes sense. Adding a TODO comment for now.

Comment on lines 171 to 174
deployment, err = c.syncManaged(ctx, required, opStatus, syncContext)
if err != nil {
errors = append(errors, err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in else branch of the previous if. Otherwise the operator deletes the Deployment and then just re-creates it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch, fixed.

@RomanBednar RomanBednar changed the title DRAFT: New deployment controller OCPBUGS-24588: use new workload controller with preconditions Oct 1, 2024
@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. labels Oct 1, 2024
@openshift-ci-robot
Copy link

@RomanBednar: This pull request references Jira Issue OCPBUGS-24588, which is invalid:

  • expected the bug to target the "4.18.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

ClusterCSIDriver condition when precondition fails:

 - lastTransitionTime: "2024-07-24T11:13:45Z"
   message: |
     dummy precondition 3 failed
   reason: PreconditionNotFulfilled
   status: "True"
   type: DeploymentDegraded

Controllers can be configured with csi-operator to have multiple preconditions, errors are then aggregated in ClusterCSIDriver:

- lastTransitionTime: "2024-07-24T11:13:45Z"
   message: |
     dummy error 2
     dummy error 3
   reason: PreconditionNotFulfilled
   status: "True"
   type: DeploymentDegraded

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Oct 1, 2024
@RomanBednar
Copy link
Contributor Author

/hold

openshift/library-go#1815 has to be merged first followed by library-go bump here

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 1, 2024
@RomanBednar
Copy link
Contributor Author

@jsafrane Moved library-go parts to separate PR and addressed all comments. If we can finish this part I think up next is to attempt workloads controller refactor so it can use daemon set as well and after testing this for a while migrate all operators to the new controller.

@RomanBednar RomanBednar marked this pull request as ready for review October 2, 2024 08:12
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 2, 2024
@RomanBednar
Copy link
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Oct 2, 2024
@openshift-ci-robot
Copy link

@RomanBednar: This pull request references Jira Issue OCPBUGS-24588, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.18.0) matches configured target version for branch (4.18.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Contributor

openshift-ci bot commented Nov 9, 2024

@RomanBednar: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/ci-index-aws-efs-csi-driver-operator-bundle 04ab0d7 link true /test ci-index-aws-efs-csi-driver-operator-bundle
ci/prow/unit 04ab0d7 link true /test unit
ci/prow/smb-operator-e2e-extended 04ab0d7 link false /test smb-operator-e2e-extended
ci/prow/aws-efs-operator-e2e 04ab0d7 link true /test aws-efs-operator-e2e
ci/prow/e2e-azure-ovn-upgrade 04ab0d7 link true /test e2e-azure-ovn-upgrade
ci/prow/e2e-azure 04ab0d7 link true /test e2e-azure
ci/prow/e2e-azure-file-nfs-csi 04ab0d7 link true /test e2e-azure-file-nfs-csi
ci/prow/e2e-aws-ovn-upgrade 04ab0d7 link true /test e2e-aws-ovn-upgrade
ci/prow/smb-operator-e2e 04ab0d7 link false /test smb-operator-e2e
ci/prow/hypershift-aws-e2e-external 04ab0d7 link true /test hypershift-aws-e2e-external
ci/prow/security 04ab0d7 link false /test security
ci/prow/verify 04ab0d7 link true /test verify
ci/prow/images 04ab0d7 link true /test images
ci/prow/e2e-azure-csi-extended 04ab0d7 link false /test e2e-azure-csi-extended
ci/prow/e2e-azure-file-csi 04ab0d7 link true /test e2e-azure-file-csi
ci/prow/aws-efs-operator-e2e-extended 04ab0d7 link false /test aws-efs-operator-e2e-extended
ci/prow/e2e-azurestack-csi 04ab0d7 link false /test e2e-azurestack-csi
ci/prow/ci-index-smb-csi-driver-operator-bundle 04ab0d7 link true /test ci-index-smb-csi-driver-operator-bundle
ci/prow/e2e-azure-csi 04ab0d7 link true /test e2e-azure-csi
ci/prow/e2e-azure-file-csi-extended 04ab0d7 link false /test e2e-azure-file-csi-extended
ci/prow/e2e-aws-csi-extended 04ab0d7 link false /test e2e-aws-csi-extended
ci/prow/e2e-aws-csi 04ab0d7 link true /test e2e-aws-csi
ci/prow/e2e-openstack-cinder-csi 04ab0d7 link true /test e2e-openstack-cinder-csi
ci/prow/e2e-openstack-manila-csi 04ab0d7 link true /test e2e-openstack-manila-csi

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants