[16.0] [ENH] queue_job: identity_key enhancements #546
Conversation
richard-willdooit
commented
Jun 16, 2023
- In production, a job which is waiting dependencies or which has started, but not completed, should not be repeated if the identity_key matches.
- In tests, the mock queue handler is now enhanced to allow better mimicking of the identity_key blocks from production.
- In tests, the mock queue handler now clears the enqueued jobs after performing them, to better reproduce what a production environment would do.
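The first change can be sketched in plain Python. This is an illustrative model of the de-duplication check, not the actual Odoo ORM call; the state constant values are assumed to mirror queue_job's state names.

```python
# Sketch of the widened identity_key de-duplication check (plain Python,
# not the actual queue_job code). State values assumed from the addon.
WAIT_DEPENDENCIES = "wait_dependencies"
PENDING = "pending"
ENQUEUED = "enqueued"
STARTED = "started"

# Before this PR, only these states blocked a duplicate:
BLOCKING_STATES_BEFORE = {PENDING, ENQUEUED}
# After this PR, jobs waiting on dependencies or already running block too:
BLOCKING_STATES_AFTER = {WAIT_DEPENDENCIES, PENDING, ENQUEUED, STARTED}

def is_duplicate(existing_jobs, identity_key, blocking_states):
    """Return True if a job with the same identity_key is in a blocking state."""
    return any(
        job["identity_key"] == identity_key and job["state"] in blocking_states
        for job in existing_jobs
    )
```

With the widened set, a second job with the same identity_key is skipped even while the first one is still running.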
Hi @guewen,
Thanks a lot!
Dependencies and trap_jobs were the last (very) big things I worked on in this addon, and I'm thrilled to see they are used and being improved!
@@ -300,7 +300,7 @@ def job_record_with_same_identity_key(self):
     .search(
         [
             ("identity_key", "=", self.identity_key),
-            ("state", "in", [PENDING, ENQUEUED]),
+            ("state", "in", [WAIT_DEPENDENCIES, PENDING, ENQUEUED, STARTED]),
Out of curiosity, did you actually use identity keys on a graph of jobs?
When I was working on the dependencies feature, I had a lot of thoughts and no clear outcome of how it should be handled. A job that would be skipped because it already exists in another graph could make the whole (new) graph incoherent, that's why I ended up checking if all the identity keys match.
About the addition of STARTED, I'm uncertain. Depending on the use case, we should or should not include it.
As part of my current work, I use Sidekiq daily. They have a similar feature, but you can set a per-job parameter unique_until with the options start (which would mean up to ENQUEUED here) or success (which would be up to STARTED here).
Think about this use case: a job refreshes a cache. Data have changed, we create a pending job. Data change again, no new job because the job is still pending. Job starts. Data change while the job is running. In this very case, we'd like to enqueue a new job otherwise the cache will be outdated.
I reckon that both cases are valid, but I fear adding this state to the domain may, silently and in subtle ways, break existing behaviors.
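The cache-refresh race above can be shown with a toy simulation (plain Python, with assumed semantics: a new job is skipped whenever any job exists in a blocking state):

```python
# Toy simulation of the cache-refresh race: data changes while a job is
# already STARTED. Whether a new job is created depends on whether
# "started" is included in the blocking states.
def enqueue_refresh(jobs, blocking_states):
    """Enqueue a cache-refresh job unless a blocking one already exists."""
    if any(job["state"] in blocking_states for job in jobs):
        return False  # de-duplicated: no new job created
    jobs.append({"state": "pending"})
    return True

# With "started" in the domain, the new job is skipped (stale-cache risk):
skipped = not enqueue_refresh(
    [{"state": "started"}], {"pending", "enqueued", "started"}
)
# Without "started", a new pending job is created and the cache catches up:
created = enqueue_refresh([{"state": "started"}], {"pending", "enqueued"})
```

This is why the two unique_until semantics diverge: "until started" re-enqueues work that arrives mid-run, while "until done" gives mutual exclusion at the cost of possibly missing a late change.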
No, I did not use it with a graph.
To give it some context: a flag switch on res_company for a custom scenario, where a lot of data was involved, triggered an extremely long process involving write-outs and read-ins, so we felt a background job would help the UX. So it is always one job.
The "flag" to indicate the state will not be available until after the job is complete, so we were using the identity_key to ensure it does not get double-run.
If we do not include "started", then a second click could launch a job while the first one is already executing... and if the jobs ran in parallel, we'd be in strife. So we were hoping the identity_key would provide that mutual exclusion.
So yes, maybe both cases are valid... I was going to add a new option for "mutual exclusivity", or "until finished", but did not want to change the architecture so much. We really need started to be included in our checks, because enqueued is likely to last only a few seconds, while running is likely to last a very, very long time...
In case you want to explore this direction, I do not think it is a large change: we don't need a new argument, even in with_delay, as all jobs for the same method will have the same option. It could be a new field on queue.job.function with the two options; the job instance can then access it through self.job_options and adapt the domain accordingly.
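That suggestion could be sketched as follows. Everything here is hypothetical: the option name unique_until, its values "start"/"done", and the helper names are assumptions for illustration, not the actual queue_job API.

```python
# Hypothetical sketch: derive the blocking-state list for the duplicate
# check from a per-function "unique_until" option (names assumed, not
# the real queue_job API).
WAIT_DEPENDENCIES, PENDING, ENQUEUED, STARTED = (
    "wait_dependencies", "pending", "enqueued", "started",
)

def blocking_states(unique_until="start"):
    """Map a unique_until option to the job states that block duplicates."""
    if unique_until == "done":
        # Block until the job has fully completed (mutual exclusion).
        return [WAIT_DEPENDENCIES, PENDING, ENQUEUED, STARTED]
    # Default: block only until the job starts (Sidekiq's "start" semantics).
    return [WAIT_DEPENDENCIES, PENDING, ENQUEUED]

def identity_domain(identity_key, unique_until="start"):
    """Build an Odoo-style search domain for the duplicate check."""
    return [
        ("identity_key", "=", identity_key),
        ("state", "in", blocking_states(unique_until)),
    ]
```

Stored on queue.job.function, the option would apply uniformly to all jobs of a given method, which is why no new with_delay argument would be needed.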
Worth noting: the identity key cannot be a 100% safe way to prevent duplicate jobs: if two transactions create the same job concurrently before committing, there will be a duplicate. To prevent this, we would need a Postgres unique constraint, and I preferred not to add one, to avoid having transactions rolled back because of it.
So it's better if jobs are idempotent anyway and verify during execution whether they still have something to do (which can be coupled with a lock or an advisory lock).
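The idempotent-job pattern can be sketched like this: under a lock, the job re-checks whether its work is still needed before doing it. In Odoo the lock would typically be a Postgres advisory lock; a threading.Lock stands in here so the sketch is self-contained, and the function and field names are illustrative assumptions.

```python
import threading

# Sketch of an idempotent job: re-check, under a lock, whether the work
# is still needed. threading.Lock stands in for a Postgres advisory lock
# (e.g. pg_advisory_xact_lock) so the example runs standalone.
_lock = threading.Lock()

def refresh_cache_job(cache, source):
    """Idempotent job: only refresh the cache if it is actually stale."""
    with _lock:  # stand-in for an advisory lock in a real deployment
        if cache.get("version") == source["version"]:
            return "skipped"  # duplicate job: nothing left to do
        cache["version"] = source["version"]
        cache["data"] = source["data"]
        return "refreshed"
```

With this shape, a concurrently-created duplicate job is harmless: the second execution sees the work is already done and exits without side effects.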
Yes, I see where you are coming from.
Another option I was thinking about was, for certain jobs, not allowing them to be requeued from the queue. In our scenario, we kicked off the job from a button. But if the job fails, and the data then changes in such a way that the button is no longer applicable, the job has lost its applicability too and should not be requeued. I have done the right thing and made sure our method is autonomous in its own right and checks that it is still applicable, but I did wonder if there was a place for special jobs which should be tried once, and once only, and could not be requeued even if they failed...
Ultimately, the main changes I wanted were to introduce tests in our module that replicated the behaviour of our process not allowing double queuing.
To give even more context on how we have used it: if I call write on 2 res_company records with 4 values, then 3 values are written to both records, and 2 jobs are enqueued to update the other value on the 2 records. It works well, but tests were quite important to show that the behaviour was as expected, and I then decided to extend them to count the jobs queued, to show that double queuing would not happen for the same record (which it did not in production, but the mock test object incorrectly indicated it had).
If you would like me to remove the states from the PR, and just push the test changes, I can do so - or you can modify my PR as you see fit. We would likely then work from my fork for our environment. Let me know.
I think you may remove the
@richard-willdooit thanks for this nice work :)
@richard-willdooit one more thing: when you get back to this, please remove the Odoo version from the commit. It's not needed and it makes little sense when we forward-port/back-port changes. Thanks!
@richard-willdooit gentle ping: any plan to move this forward?