Refactor groovy script to include batching. #1008

Open: wants to merge 4 commits into base: master
70 changes: 55 additions & 15 deletions doc/ongoing_operations.rst
@@ -81,34 +81,74 @@ Perform action on a set of jobs
-------------------------------

Sometimes you want to do bulk actions like disable or delete all jobs of a specific distro or just one target.
We recommend running a Groovy script using the script console.

You can change the distro letter or add or remove prefixes from the list in order to reduce the scope of targeted jobs.

Some operations take time, and an HTTP timeout creates performance issues with dangling script runs.
To prevent them, the example script uses a ``BATCH_SIZE`` constant to control how many jobs to change before returning.
You can keep the default, fairly conservative batch size, or increase that constant steadily until results are no longer instantaneous and then adjust back down.
Comment on lines +88 to +90

I'm confused, does it process jobs even if they're processed/disabled in the previous iteration?

Member:
I think it's fair to say that in this context, "processing" means changing. That's what seems to take significant time and causes the script invocation to time out.

Simply inspecting all of the jobs can happen in every invocation. The continue in the for-each loop should avoid incrementing the counter if the operation would have been a no-op, so "processing" should be 1-to-1 with "changing" here.
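The guard-then-continue pattern described here can be simulated without Jenkins. This is a hypothetical sketch in plain Groovy with stand-in data, not part of the PR:

```groovy
// Minimal simulation of the batching guard (stand-in data, no Jenkins):
// jobs that are already in the target state are skipped with `continue`,
// so only real changes consume the batch budget.
BATCH_SIZE = 2
jobs = [[name: 'Fci_a', disabled: true],   // no-op: already disabled
        [name: 'Fci_b', disabled: false],
        [name: 'Fci_c', disabled: false]]
count = 0
for (job in jobs) {
    if (count >= BATCH_SIZE) { break }
    if (job.disabled) { continue }         // inspection is cheap; skip without counting
    job.disabled = true                    // the "expensive" change
    count++
}
assert count == 2                          // only the two actual changes were counted
```

Re-running the same loop would then find every job disabled and count nothing, matching the "processing equals changing" reading above.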

Contributor Author (@nuclearsandwich, Jun 26, 2023):
Good question! @cottsay is correct here that iterating over jobs does not take much time at all but performing operations on them (enable, disable, delete) does.

So each subsequent run of the script will iterate over a growing number of previously processed jobs, but checking the predicate is not very time-intensive compared to actually performing an operation, so it does not have much of an observable effect on the script's return time.

I resisted the urge to do even more programming and create predicate/operation pairs with another top-level variable, so that the operation's predicate could be added to the filter closure passed to getItems and only jobs that meet the predicate would be iterated over.

Running the script repeatedly until there are no longer any remaining jobs to process will work as long as non-destructive changes, like enabling or disabling jobs, are properly skipped.

The following Groovy script is a good starting point for various actions:

.. code-block:: groovy

import hudson.model.Cause
DISTRO_LETTER = "F"
PREFIXES = ["ci_", "dev_", "pr_", "rel", "src_", "bin_"].collect({pre -> DISTRO_LETTER + pre})
Contributor Author:
It might be overkill to add a DISTRO_LETTER constant one line before its only use, but I like making the bits that are meant to be adjusted very obvious rather than hard-coding the letter into the map/collect that generates all of the prefixes.

Contributor:
Just as an FYI, if we wanted to get just a bit fancy, we could do this at the top:

DISTRO_NAME = "foxy"
DISTRO_LETTER = DISTRO_NAME[0..0].toUpperCase()
PREFIXES = [DISTRO_NAME]
PREFIXES += ["ci_", "dev_", "pr_", "rel", "src_", "bin_"].collect({pre -> DISTRO_LETTER + pre})   

That way we'll pick up the jobs that are prefixed with the DISTRO_NAME as well.
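For reference, the suggested snippet can be checked in any plain Groovy shell; the expected list in the assert is my own expansion, not taken from the PR:

```groovy
// Sanity check of the suggested prefix construction (standalone Groovy):
DISTRO_NAME = "foxy"
DISTRO_LETTER = DISTRO_NAME[0..0].toUpperCase()
PREFIXES = [DISTRO_NAME]
PREFIXES += ["ci_", "dev_", "pr_", "rel", "src_", "bin_"].collect({pre -> DISTRO_LETTER + pre})
assert PREFIXES == ["foxy", "Fci_", "Fdev_", "Fpr_", "Frel", "Fsrc_", "Fbin_"]
```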

Contributor:
FYI, this is also missing the doc jobs, so those should be added to the list above as well.


def starts_with_any_prefix(name) {
    for (prefix in PREFIXES) {
        if (name.startsWith(prefix)) {
            return true
        }
    }
    return false
}

Comment on lines +110 to +116
Contributor Author:
I hoisted this comment up into the documentation above as well. I am torn between leaving it in the script and cutting it since it's documented above.

BATCH_SIZE = 100

count = 0
for (job in Jenkins.get().getItems({j -> starts_with_any_prefix(j.name)}))
{
    if (count >= BATCH_SIZE)
    {
        println("Reached ${BATCH_SIZE} limit before processing ${job.name}.")
Contributor Author:

I could see someone omitting the print of processed job names, but when running in batches it is otherwise the only feedback that you aren't processing the same 100 jobs over again (such as when there's a mismatch between the predicate and the operation), so I added this print as an indicator of where we stopped.

If I was more bored I might be tempted to implement some kind of cursor pattern.

        break
    }

    /* Disable a job if it is not already disabled */
    // if (job.isDisabled()) { continue }
    // job.disable()

    /* Enable a job if it is currently disabled */
    // if (!job.isDisabled()) { continue }
    // job.enable()

    /* Delete a job! This action is irreversible! */
    // job.delete()

    println(job.name)

    /* Increase count for batch processing. */
    count++
}

if (count < BATCH_SIZE)
{
    println("Completed execution of the last batch.")
Contributor Author:

I've always previously just run the script until it returns 0 results, but this way the script positively reports that it has processed the last batch (based on the fact that the batch is smaller than the maximum batch size).

It is possible, if the number of jobs to process is precisely divisible by the batch size, that this will print only after an empty batch, but it should still print.

}
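The divisibility edge case discussed in the comment above can be modeled without Jenkins. A toy sketch with hypothetical stand-ins, assuming ``count < BATCH_SIZE`` as the completion check:

```groovy
// Toy model of running the script repeatedly when the job count (4) is
// exactly divisible by BATCH_SIZE (2): the completion condition only
// holds on the third, empty run.
BATCH_SIZE = 2
pending = ['a', 'b', 'c', 'd']      // stand-ins for jobs still needing the change
runs = 0
completed = false
while (!completed) {
    runs++
    count = 0
    for (job in new ArrayList(pending)) {
        if (count >= BATCH_SIZE) { break }
        pending.remove(job)         // the "expensive" operation
        count++
    }
    completed = (count < BATCH_SIZE)
}
assert runs == 3                    // two full batches, then one empty confirming run
```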

This script will print only the matched job names.
You can uncomment any of the actions to disable, enable, or delete these projects.
Contributor Author:
I removed the instructions for triggering builds using this script because the _trigger-jobs jobs should cover those use cases now and there is not an entirely reasonable way to track that state using the batch setup.


To run a Groovy script:
