
sync: add support for moving fixed bugzilla bugs to verified #161

Merged

Conversation

AlexNPavel
Contributor

This PR adds a new step that gets run when a release gets accepted.

With the addition of the /bugzilla cc-qa command to the bugzilla hook plugin for prow, developers can request a review from the QA contact specified in Bugzilla. The QA contact can review the code and verify that a new test exists in the bugfix PR that fails if the bug was not properly fixed. In this case, the QA contact can comment /lgtm or leave an approving review on a repo that had ReviewActsAsLGTM set up for the lgtm plugin.

This allows us to automatically verify bugs in the release-controller. When a release gets accepted, the release-controller can generate a list of bugs that were fixed in the release. It can then find the GitHub PR associated with the bug and, if a QA contact has LGTM'd the PR, the bug can be updated to the VERIFIED state.

This PR does all the steps up to actually updating the bug state. For now, it simply prints whether the bug would be updated to VERIFIED. This lets us check that it is working properly before configuring it to make actual changes in Bugzilla.

/cc @smarterclayton @stevekuznetsov

@openshift-ci-robot added the do-not-merge/work-in-progress label Apr 3, 2020
Contributor

@stevekuznetsov left a comment

/assign @smarterclayton

Contributor

@stevekuznetsov left a comment

What is the expected frequency of execution and total throughput on the GitHub calls?

@AlexNPavel
Contributor Author

What is the expected frequency of execution and total throughput on the GitHub calls?

The frequency is 1 comment list request and 1 review list request per bug fixed in new releases (unless a PR has more than 100 comments or 100 reviews; then the ghClient paginates every 100 comments/reviews). Most nightlies seem to have ~1 fixed bug. Full releases may have more (4.4.0-rc4 had 19 fixed bugs from 4.4.0-rc3; 4.4.0-rc6 had 24 fixed bugs from 4.4.0-rc4). Overall, I think we'll probably make <100 requests per hour.

I'm not sure about throughput. Since the frequency of these calls will be quite low, I don't expect us to get throttled, and getting comments and reviews should be quick because of that. The main throughput issue we tend to face with GitHub is rate-limit abuse throttling for calls that are frequent and need heavy pagination (like ListCollaborators in hook), which won't be an issue here.

@@ -547,6 +547,26 @@ func (c *Controller) syncAccepted(release *Release) error {
}
}
}
// If bugzilla bugs were fixed by a PR approved by their QA contact, move bug to verified
if c.bugzillaVerifier != nil {
Contributor

This may have gotten lost in the previous PR, but this can't stay here and doesn't belong here. You need to move this to a side sync loop like audit (which is just a separate main controller loop at the top level) that gets notifications of changes to image streams but has a completely different work queue and lifecycle.

With the current code structure, if any of this fails it will never get run again, and the goroutines here actually make that worse because we could run syncAccepted fifty times in a second.

Your new loop at the top level is "ensure that every new accepted release has successfully marked all bugs fixed between those two levels as verified at least once, and then in the future it can be a no-op". The two parts that are still missing are the "no-op" part - if you've completed successfully you need some indicator that is so (probably for now we'll add that as a verification field on the actual release tag) - and the control flow that doesn't ever run two coroutines at the same time.

While audit.go is more structured around running jobs, you can start with its flow by running the "verify bugs" function in its place, and if everything completes successfully annotating the release tag with an agreed on value (for now you can test with an arbitrary name and then when I see the code we can agree on what the actual value and name will be). Then you skip any release that has that "already verified successfully", and retry any that don't have it.

Also, it looks like there's a logical gap here that you need to only be running this on image streams that are nightlies (CIs are considered insufficient to close bugs), so you'll need to add an attribute/field/struct on the Release configuration that allows only those inputs to be selected. That's probably going to be adding a new struct to "ReleasePublish" which is an optional *VerifyBugs that will hold any configuration we need. The presence of the struct will be required in your audit-like-loop to execute this code.

If you'd like to set up time on Monday to discuss this, poke me in slack and I'll clear some space on the calendar.

@AlexNPavel
Contributor Author

Updated. I am slightly unsure of when it makes sense to add to the bugzillaQueueKeys, as we only need to run when a release is accepted. I currently only added it in the if check for t.Annotations[releaseAnnotationConfig].

One more thing that I am unsure of is whether I am getting the proper current and previous tags for a release correctly. I repurposed some of the code used for the web UI for this, but there may be a better and more correct way to handle that.

@AlexNPavel marked this pull request as ready for review April 7, 2020 22:20
@openshift-ci-robot removed the do-not-merge/work-in-progress label Apr 7, 2020
}
var errs []error
// get latest accepted tag
tag := acceptedTags[0]
Contributor

So in a disaster recovery scenario you're going to verify the bugs against a newer release than the one they really went into. Are you sure you don't want to scan the list to find the last one you verified (if any) and verify one release at a time (i.e. find oldest N, then diff N and N+1, and then next time around scan the next, etc.)? And if you find no N, start at the oldest and walk forward? Knowing exactly what release a fix went into is probably valuable.

It would be good to restructure the "find both versions" into a function that takes a release stream as input and returns the two and then add a unit test for those cases.

Contributor Author

I'm not sure why we need a function to "find both versions". Shouldn't it be fine to just loop through acceptedTags until we find the last tag that we verified?

Contributor

@smarterclayton left a comment

Structure looks OK; I want to see a better algorithm for walking the accepted tags so that we pin the correct release that the code changed in.

@AlexNPavel force-pushed the bugzilla-qe-verify branch 2 times, most recently from a5521f4 to f95ec70 April 17, 2020 00:29
@smarterclayton
Contributor

Accepted tags walking looks good, a few more things and this should be ready to try out.

@smarterclayton
Contributor

Ok, this is ready for a spin. Please squash and I'll tag. How will you be testing the rollout?

This commit simply prints what would be changed in the logs; support
for actual changes to bugzilla states will be added once this is
deemed to work correctly in cluster.
@AlexNPavel force-pushed the bugzilla-qe-verify branch from 77aac5a to 496fb91 May 14, 2020 19:02
@smarterclayton
Contributor

/lgtm

@openshift-ci-robot added the lgtm label May 15, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AlexNPavel, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot added the approved label May 15, 2020
@openshift-merge-robot merged commit bc55053 into openshift:master May 15, 2020
@AlexNPavel deleted the bugzilla-qe-verify branch May 15, 2020 17:16
hongkailiu pushed a commit to hongkailiu/release-controller that referenced this pull request Sep 28, 2023
Updating the default value appropriately