-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix-16978][Master] Fix AbstractDelayEvent compare method is incorrect #16980
base: dev
Are you sure you want to change the base?
Conversation
...g/apache/dolphinscheduler/server/master/engine/task/statemachine/TaskFailureStateAction.java
Fixed
Show resolved
Hide resolved
// for now, killEventAction can be fired on failure with retry | ||
// there is no task executor now, shouldn't call workflowExecutionGraph.isTaskExecutionRunnableActive() | ||
// it's better to call super.killedEventAction() direct | ||
if (taskExecutionRunnable.isTaskInstanceCanRetry() | ||
// && workflowExecutionGraph.isTaskExecutionRunnableActive(taskExecutionRunnable) | ||
) { | ||
super.killedEventAction(workflowExecutionRunnable, taskExecutionRunnable, | ||
TaskKilledLifecycleEvent.of(taskExecutionRunnable)); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to do this change? or this is only your test code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not test code, before this pr, the kill event always triggered after retry event, so the kill or start(retry) action always fire on submitted state
this code use to finish the workflow's Topology graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I got you.
Please use
if (taskExecutionRunnable.isTaskInstanceCanRetry()
&& workflowExecutionGraph.isTaskExecutionRunnableActive(taskExecutionRunnable)) {
to replace
if (taskExecutionRunnable.isTaskInstanceCanRetry()
// && workflowExecutionGraph.isTaskExecutionRunnableActive(taskExecutionRunnable)
)
You can add a new private method for this, this logic also exist in killedEventAction
private boolean isTaskRetrying(
final IWorkflowExecutionRunnable workflowExecutionRunnable,
final ITaskExecutionRunnable taskExecutionRunnable) {
final IWorkflowExecutionGraph workflowExecutionGraph = taskExecutionRunnable.getWorkflowExecutionGraph();
return taskExecutionRunnable.isTaskInstanceCanRetry()
&& workflowExecutionGraph.isTaskExecutionRunnableActive(taskExecutionRunnable);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add a new private method for this, this logic also exist in killedEventAction
sure.
but i need test workflowExecutionGraph.isTaskExecutionRunnableActive(taskExecutionRunnable)
more for some case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm a bit confused about this:
when task is failed and going to waiting retry, this task was not removed from WorkflowExecutionGraph.activeTaskExecutionRunnable
, activeTaskExecutionRunnable.add()
in isTaskExecutionRunnableActive
will always return false
, this result prevented the execution of if-block.
or WorkflowExecutionGraph.isTaskExecutionRunnableActive
should call activeTaskExecutionRunnable.contains
instead of activeTaskExecutionRunnable.add()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is another bug, we should use contains
rather than add
here. You can fix this in this PR or submib another pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, it's better to add a judge method isTaskRetrying (ITaskExecutionRunnable taskExecutionRunnable)
at IWorkflowExecutionGraph
.
public boolean isTaskExecutionRunnableRetrying(final ITaskExecutionRunnable taskExecutionRunnable) {
if (!taskExecutionRunnable.isTaskInstanceInitialized()) {
return false;
}
final TaskInstance taskInstance = taskExecutionRunnable.getTaskInstance();
return taskInstance.getState() == TaskExecutionStatus.FAILURE && taskExecutionRunnable.isTaskInstanceCanRetry() && isTaskExecutionRunnableActive(taskExecutionRunnable);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok!
BTW, we are missing test cases in |
sure, i'll do that. |
Purpose of the pull request
Fix AbstractDelayEventBus block events issue fix #16978.
Brief change log
Verify this pull request
Stop a workflow with failed task when waiting to retry.
Pull Request Notice
Pull Request Notice
If your pull request contains incompatible change, you should also add it to
docs/docs/en/guide/upgrade/incompatible.md