A restart happened while a large number of jobs were running. After that, some of the jobs sitting in the in-progress queues were not restored. The in-progress queue key depends on the worker pool ID: `return fmt.Sprintf("%s:%s:inprogress", redisKeyJobs(namespace, jobName), poolID)`.
A restart recreates the worker pools and generates new workers with new UUIDs. It seems that the dead pool reaper only checks the workers of the current worker pool, and the previous pool is discarded. However, some of the in-progress queues still belong to the previous pool's workers. As a result, some of the in-progress jobs can never be requeued.
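To make the dependency on the pool ID concrete, here is a minimal sketch of the key construction. The `Sprintf` format is the one quoted above; the body of `redisKeyJobs`, the wrapper name `inProgressKey`, and the sample namespace, job name, and pool IDs are assumptions made for illustration and may not match the library.

```go
package main

import "fmt"

// redisKeyJobs is a simplified stand-in for the library helper of the same
// name. The real key layout may differ; only the suffix logic matters here.
func redisKeyJobs(namespace, jobName string) string {
	return fmt.Sprintf("%s:jobs:%s", namespace, jobName)
}

// inProgressKey (hypothetical name) builds the in-progress queue key using
// the format string quoted in this report. The pool ID is part of the key.
func inProgressKey(namespace, jobName, poolID string) string {
	return fmt.Sprintf("%s:%s:inprogress", redisKeyJobs(namespace, jobName), poolID)
}

func main() {
	// Sample values; real pool IDs are generated when a pool is created.
	before := inProgressKey("app", "send_email", "pool-before-restart")
	after := inProgressKey("app", "send_email", "pool-after-restart")

	fmt.Println(before)
	fmt.Println(after)
	// The keys differ, so the new pool never reads the old pool's queue.
	fmt.Println(before == after)
}
```

Because the pool ID is embedded in the key, jobs left in the old pool's queue are only reachable by something that still knows the old pool ID.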
The `Reap` flow seems to work as follows: it finds the dead pools first. In the dead pool finding process, the previous pool is usually already gone after the restart, but there are still some in-progress queues holding in-progress jobs, so those jobs can no longer be picked up.