When checking health of jobs on a specific queue - broken jobs for all queues are included #433

edwilde · 2024-06-07T04:55:12Z

Module version(s) affected

dev-master

Description

I have two queues running on a project, the normal queue which has many tiny jobs which can break from time to time and a large queue with important jobs which shouldn't break.

I added a task to automatically check the health of only the large queue, however I now get reports whenever any job is broken.

In the normal queue, I don't really care too much if the tiny jobs break - they'll be recreated at some point and should be fine. So I don't really need to be notified of this.

I would assume that when a queue is specified for the CheckJobHealthTask that only that queue would be checked for health, not the current behaviour which is to check the specific queue for jobs to be restarted and check all queues for broken jobs.

How to reproduce

Create a job in the normal queue, with status broken
Create a job in the large queue, with status new
Run vendor/bin/sake dev/tasks/CheckJobHealthTask queue=3 to run a health check on the large queue (3) only

Expected

The health check runs with no exception thrown

Actual

The health check throws an exception: Exception: 1 jobs are unhealthy 😬

Possible Solution

I believe the fix should be fairly simple, adding the extra queue restriction to the filter in QueuedJobService::checkJobHealth(), so that:

$brokenJobs = QueuedJobDescriptor::get()->filter('JobStatus', QueuedJob::STATUS_BROKEN);

Becomes:

$brokenJobs = QueuedJobDescriptor::get()
	->filter(
		[
			'JobStatus' => QueuedJob::STATUS_BROKEN,
		],
	 	'JobType' => $queue,
	);

...like the others

Additional Context

No response

Validations

Check that there isn't already an issue that reports the same bug
Double check that your reproduction steps work in a fresh installation of silverstripe/installer (with any code examples you've provided)

The text was updated successfully, but these errors were encountered:

michalkleiner · 2024-06-08T06:17:58Z

Thanks for the report @edwilde. Keen to open a PR with the suggested change?

edwilde · 2024-06-09T20:12:20Z

Yeah, I certainly will when I get chance @michalkleiner. I did a quick look over and it seems the unit tests are a bit lacking in that area too, so it might be longer than just a single line fix.

GuySartorelli added type/bug affects/v5 labels Jun 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

When checking health of jobs on a specific queue - broken jobs for all queues are included #433

When checking health of jobs on a specific queue - broken jobs for all queues are included #433

edwilde commented Jun 7, 2024

michalkleiner commented Jun 8, 2024

edwilde commented Jun 9, 2024

When checking health of jobs on a specific queue - broken jobs for all queues are included #433

When checking health of jobs on a specific queue - broken jobs for all queues are included #433

Comments

edwilde commented Jun 7, 2024

Module version(s) affected

Description

How to reproduce

Possible Solution

Additional Context

Validations

michalkleiner commented Jun 8, 2024

edwilde commented Jun 9, 2024