Updated workflow sweeper to consider a new config MaxPostponeDurationSeconds for in progress tasks #335

Merged (4 commits) on Dec 20, 2024

Conversation

danmiller192 (Contributor) commented on Dec 12, 2024

Pull Request type

  • Bugfix
  • Feature
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • WHOSUSING.md
  • Other (please describe):

NOTE: Please remember to run ./gradlew spotlessApply to fix any format violations.

Changes in this PR

Preface: Currently, when the sweeper processes a workflow whose task is in progress, it picks up the task's response timeout and queues a message on the decider queue, using that timeout to calculate when to process the workflow next. Occasionally, when the sweeper and task completion events happen out of order, this leaves workflows with completed tasks in limbo for long periods, with their next tasks not being scheduled until that timeout is reached.

This PR addresses the problem by adding a config value, MaxPostponeDurationSeconds, so that workflows whose tasks have long and varied response times are still processed by the sweeper on a minimum schedule instead of waiting for the full response timeout.
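
To illustrate the idea, here is a minimal sketch of capping the sweeper's postpone duration. It is not the code from this PR: the class `PostponeSketch`, the method `postponeFor`, and the choice to treat a zero or negative cap as "no cap configured" are all assumptions made for the example. Only the concept of a maximum postpone duration comes from the description above.

```java
import java.time.Duration;

// Hypothetical illustration only; names and semantics are assumptions,
// not the actual Conductor sweeper implementation.
class PostponeSketch {

    // Returns how long the sweeper would wait before revisiting a workflow
    // whose task is still in progress.
    static Duration postponeFor(Duration responseTimeout, Duration maxPostpone) {
        // Assumption: a zero or negative cap means no cap is configured.
        if (maxPostpone.isZero() || maxPostpone.isNegative()) {
            return responseTimeout;
        }
        // Otherwise wait for the shorter of the two durations.
        return responseTimeout.compareTo(maxPostpone) > 0 ? maxPostpone : responseTimeout;
    }

    public static void main(String[] args) {
        Duration cap = Duration.ofMinutes(10);
        // Long response timeout: the cap wins.
        System.out.println(postponeFor(Duration.ofHours(1), cap).getSeconds());
        // Short response timeout: the timeout itself is used.
        System.out.println(postponeFor(Duration.ofSeconds(30), cap).getSeconds());
        // No cap configured: falls back to the response timeout.
        System.out.println(postponeFor(Duration.ofHours(1), Duration.ZERO).getSeconds());
    }
}
```

With a cap in place, a workflow whose task completed just after the sweeper ran would be revisited after at most the capped interval rather than after the full response timeout.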

Describe alternative implementation you have considered

I have looked at ways to push the workflow to the decider directly, but because of the asynchronous nature of the system this can still occasionally fall out of step.

Updated workflow sweeper to use the workflowOffsetTimeout rather than the task timeout when the task is in progress.
danmiller192 changed the title from "Workflow postpone duration fix" to "Updated workflow sweeper to consider a new config MaxPostponeDurationSeconds for in progress tasks" on Dec 12, 2024
danmiller192 force-pushed the workflow_postpone_duration_fix branch from e8c4c14 to a5a8f17 on December 13, 2024 21:41
v1r3n merged commit e0780c3 into conductor-oss:main on Dec 20, 2024
2 checks passed
2 participants