Long running tasks #32

Open
arsendovlatyan opened this issue Aug 12, 2019 · 4 comments

@arsendovlatyan (Member)

Sometimes tasks can hang in the running state. It would be good if we could set a maximum running time per task or per bulk argument, send a notification once it is exceeded, and give an option to delete the task or mark it as failed.

I think we can add a 'last modified' attribute, which can be used to implement this functionality and also provides useful info.

@zzhovo (Collaborator) commented Oct 18, 2019

After running each simple task or a task with bulk arguments, we check the duration and exit the process, so there's no need to define a duration for each task or bulk argument.

The problem is that when the execution times out, the bulk arguments and the task itself stay marked as running.

Solution 1:
Before starting to run the tasks, we can mark all the previously running tasks as failed, because there should not be two running queues, and the only reason we still have running tasks is that they failed because of the timeout. The problem here is that we can't log the exact time the task failed, but that can be fixed by logging the start time before running the task and, at the end, updating the log entry with the post-execution info. Right now we log all the info after the task has run with a single DB query.
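
A minimal sketch of what that could look like, assuming a hypothetical `btm_task` table plus a `btm_task_log` table with `started_at`/`finished_at` columns and a MySQL-style `NOW()`; the names are illustrative, not the real schema:

```php
<?php
// Sketch of Solution 1: clean up stale "running" tasks, then log in two steps.

function markStaleRunningTasksAsFailed(PDO $db): void
{
    // Any task still marked "running" at this point must have died on a
    // previous run (timeout/fatal error), because only one queue runs at a time.
    $db->exec("UPDATE btm_task SET status = 'failed' WHERE status = 'running'");
}

function runTask(PDO $db, int $taskId, callable $work): void
{
    // Write the start time up front, so even if the process dies mid-run
    // the log row already records when the task began.
    $stmt = $db->prepare(
        "INSERT INTO btm_task_log (task_id, started_at) VALUES (:id, NOW())"
    );
    $stmt->execute(['id' => $taskId]);
    $logId = (int) $db->lastInsertId();

    $work();

    // On success, complete the same log row with the post-execution info.
    $stmt = $db->prepare(
        "UPDATE btm_task_log SET finished_at = NOW(), status = 'done' WHERE id = :log"
    );
    $stmt->execute(['log' => $logId]);
}
```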

Solution 2:
Create a timer that will trigger a callback before the timeout, and the callback will mark the current task and its bulk arguments as failed. The problem here is that we would need a PHP extension, and it is hard to define the exact moment of "before the timeout".
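
For reference, a rough sketch of the timer approach using the pcntl extension (CLI only; the timeout value and the helper below are hypothetical placeholders, not existing code):

```php
<?php
// Sketch of Solution 2: fire an alarm slightly before the hard timeout.

function markCurrentTaskAsFailed(): void
{
    // Placeholder: would mark the current task and its bulk arguments as failed.
    fwrite(STDERR, "Marking current task as failed before the hard timeout\n");
}

pcntl_async_signals(true);

$timeout      = 600; // e.g. PHP max execution time, in seconds (assumption)
$safetyMargin = 10;  // how long before the hard timeout the callback should fire

pcntl_signal(SIGALRM, function (): void {
    markCurrentTaskAsFailed();
    exit(1);
});

pcntl_alarm($timeout - $safetyMargin);

// ... run the task here ...

pcntl_alarm(0); // cancel the alarm if the task finished in time
```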

If we want to track the time the task was changed by an admin as the "last modified" time, we can hold the info in the same table. If we want to track a time that is related to the execution rather than to the task itself, then we should log it.

@arsendovlatyan (Member, Author)

@zzhovo what will happen if execution of a bulk task takes more than the allowed max execution time? I think it stays in the 'running' state and waits for the next execution request, so even if you schedule a new task with a higher priority, it will still execute the 'running' one, right?

@zzhovo (Collaborator) commented Oct 21, 2019

@arsendovlatyan when a task takes more than the allowed execution time, the request ends with a fatal error and the status of that task stays "running". The next cycle starts running the other tasks with status "registered" or "in progress" and doesn't do anything with the "running" tasks, even though they are not actually "running" but "failed". So what I suggest is not creating a separate task to fix the running tasks, but simply marking them as "failed" before starting to execute the other tasks.

@arsendovlatyan (Member, Author)

During our discussion today with @eriktad and @armdev17 we decided:

  • replace the "last modified" term with "last run", so it identifies the last time the system started executing the task.
  • keep the logic already developed by @eriktad, i.e. if a task is in the running state but its "last run" was x seconds ago and x is more than twice BTM's max execution time, mark the 'running' arguments as failed, add a log entry, and move the task to "in progress".

However, I think the condition "'last run' was x seconds ago and x is more than twice BTM's max execution time" is a bit tricky. Why twice? Maybe just "'last run' was x seconds ago and x is bigger than BTM's max execution time", or "'last run' was x seconds ago and x is bigger than PHP's max execution time"?

Our target with this change is to keep tasks alive after a disaster, so we don't need to wait twice as long, because other tasks may come in and they won't be merged into the running task, right?

If so, we can remove the "twice"; the question is whether to compare with PHP's max execution time or BTM's. I think PHP's gives us more stability: if a task is running longer than PHP's max execution time, in 99% of cases something is wrong with that task. Using BTM's gives us more control, but this option is stricter and more critical: you expect your tasks to be completed within that period for a given bulk size, otherwise they can be terminated and will fail.

I think in our case, and for advanced users, the second option (using BTM's) is good. For others, we can recommend that the cron interval, the execution time, and the PHP max execution time match each other. Usually PHP's max execution time is 30 seconds (in our case it's 600 or so), so if someone sets the BTM execution time below the PHP max execution time, we can give them a warning saying something like "tasks can be interrupted before the PHP max execution time has passed". We don't need to do this now, these are just thoughts for the record. For now, we can just remove the "twice".
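
To make the agreed check concrete once the "twice" factor is removed, here is a minimal sketch; the table and column names (`btm_task`, `btm_task_argument`, `last_run_at`) are assumptions standing in for the real schema:

```php
<?php
// Sketch of the pre-run cleanup: tasks still "running" whose last run started
// more than the configured max execution time ago are treated as interrupted.

function failStaleRunningTasks(PDO $db, int $maxExecutionTimeSeconds): void
{
    $cutoff = (new DateTimeImmutable("-{$maxExecutionTimeSeconds} seconds"))
        ->format('Y-m-d H:i:s');

    // Fail the bulk arguments that were left in "running".
    $stmt = $db->prepare(
        "UPDATE btm_task_argument
            SET status = 'failed'
          WHERE status = 'running'
            AND task_id IN (SELECT id FROM btm_task
                             WHERE status = 'running' AND last_run_at < :cutoff)"
    );
    $stmt->execute(['cutoff' => $cutoff]);

    // Move the task itself back to "in progress" so the next cycle picks it up.
    $stmt = $db->prepare(
        "UPDATE btm_task
            SET status = 'in progress'
          WHERE status = 'running' AND last_run_at < :cutoff"
    );
    $stmt->execute(['cutoff' => $cutoff]);
}
```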
