No problem indicators on failed workflow input deserialization during queries #933

Open
Drahflow opened this issue Oct 13, 2022 · 2 comments

Comments

@Drahflow

Expected Behavior

When a workflow is queried (e.g. via the Temporal UI) and the workflow worker cannot replay the workflow state because the workflow's input payload fails to deserialize, there should be some log message about said failure to deserialize.
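
To make the failure mode concrete, here is a minimal sketch of an input payload that no longer decodes into the type the workflow code expects. It assumes a JSON-encoded payload; `orderInput` and `decodeInput` are hypothetical names used only for illustration and are not part of the SDK.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// orderInput is a hypothetical workflow input type (an assumption for this sketch).
type orderInput struct {
	Quantity int `json:"quantity"`
}

// decodeInput decodes a JSON payload into orderInput and wraps any decode error
// so the caller can tell where the failure came from.
func decodeInput(payload []byte) (orderInput, error) {
	var in orderInput
	if err := json.Unmarshal(payload, &in); err != nil {
		return orderInput{}, fmt.Errorf("decode workflow input: %w", err)
	}
	return in, nil
}

func main() {
	// The stored payload holds a string where the current struct expects an int,
	// so decoding fails every time the input is decoded again.
	if _, err := decodeInput([]byte(`{"quantity":"three"}`)); err != nil {
		fmt.Println("input could not be decoded:", err)
	}
}
```

Unless an error like this is surfaced somewhere, nothing tells the person issuing the query why it cannot be served.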

Actual Behavior

The desired query type is not visible in the UI. Stack traces (which are also served by queries) are unavailable, and no explanation is given.

Please see #932 for an initial analysis of the cause and some pointers where it goes wrong.

@cretz
Member

cretz commented Oct 13, 2022

> there should be some log message about said failure to deserialize.

From your PR, it seems your logs are in a general error location, and obviously we don't want to add logs for all errors there (some are expected/normal). Do you have a specific suggestion for where to add logging? What about workflow and signal argument deserialization errors? (Even if you have no specific suggestion, no worries; we will investigate where to add it.)

@Drahflow
Author

I tried to find a good place but couldn't, thus the half-baked PR. The problem is that, from the perspective of dispatcher startup, these errors are indistinguishable from "normal" termination with a workflow error. And the history replay loop (where it is known that the code has just replayed the WorkflowStart) doesn't seem to have a good way to access the workflow environment in which the dispatcher records the error.

That might be the ideal logic: if the workflow went into an error state after replaying just the StartWorkflow event -> log.
(I think replay errors of later events are covered by the non-determinism check anyway.) No idea whatsoever about signal payload deserialization problems; we haven't run into those yet :)
