[BUG] Local and remote caching behaviour should not differ #5837
Comments
In case anyone observes another situation where the behaviour differs, feel free to add to this issue.
#take
@luckyarthur, please let me know which bugs you're going to be fixing, ok?
I'm trying to fix this one now. Sorry it's taking a while; I'm new to this whole system and am still trying to locate the backend logic that handles caching for tasks with no return value.
@luckyarthur, for sure. I just meant which one of the two issues described here you were planning to tackle.
I'm working on both of them, since they are presented in one issue.
Describe the bug
As a user, I would expect that the caching behaviour is the same when executing a workflow in a cluster vs executing it locally as a python script.
In practice, there are situations where the behaviour differs:
Expected behavior
Caching of tasks without return values:
Locally, this task can be cached while in a cluster execution it can't be. Flyteconsole says "Caching was disabled for this execution".
As a user, I have a strong preference for being able to cache tasks without a return value, because tasks can have side effects (e.g. storing a resulting metric in a metadata store) that don't need a return value but are still supposed to be cached. We have multiple tasks in our code base that have a dummy return value only to allow the task to be cached.
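To illustrate why a task without a return value can still be cached, here is a minimal standard-library sketch. It is not Flyte's implementation: the `cached` decorator, the `_MISSING` sentinel, and the `store_metric` function are all hypothetical names made up for this example. The point it tries to show is that a cache hit can be decided by whether a key is present, not by whether the stored output is non-empty.

```python
import functools

_MISSING = object()  # sentinel: distinguishes "no entry" from a cached None


def cached(func):
    """Toy in-memory cache keyed on the function name and its arguments."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = (func.__qualname__, args, tuple(sorted(kwargs.items())))
        hit = store.get(key, _MISSING)
        if hit is not _MISSING:
            return hit  # cache hit, even when the stored value is None
        result = func(*args, **kwargs)
        store[key] = result
        return result

    return wrapper


calls = []  # records side effects so that cache hits are observable


@cached
def store_metric(name: str, value: float) -> None:
    # Stands in for a side effect such as writing to a metadata store.
    calls.append((name, value))


store_metric("accuracy", 0.9)
store_metric("accuracy", 0.9)  # second call is served from the cache
```

With this design the side effect runs once per distinct input, so no dummy return value is needed just to make the call cacheable.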
Cache misses upon schema changes:
When executing this workflow, adding b: int to Foo as an example of a schema change, and executing again, there is an expected cache miss in the remote execution but an unexpected cache hit in the local execution. The local behaviour needs to be adapted.
Additional context to reproduce
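As context for the schema-change case described under "Cache misses upon schema changes", the following standard-library sketch shows one way a cache key could be made sensitive to the output type's fields. It is an assumption-laden illustration, not how flytekit or the Flyte backend actually computes keys: `schema_fingerprint`, `cache_key`, `FooV1`, and `FooV2` are hypothetical names, with `FooV1` and `FooV2` standing in for Foo before and after adding b: int.

```python
import dataclasses
import hashlib
import json


def schema_fingerprint(cls) -> str:
    """Hash the (field name, field type) pairs of a dataclass."""
    fields = [
        (f.name, getattr(f.type, "__name__", str(f.type)))
        for f in dataclasses.fields(cls)
    ]
    return hashlib.sha256(json.dumps(fields).encode()).hexdigest()


def cache_key(task_name: str, cache_version: str, inputs: dict, output_type) -> str:
    """Combine task identity, inputs, and the output schema into one key."""
    payload = json.dumps(
        {
            "task": task_name,
            "version": cache_version,
            "inputs": inputs,
            "output_schema": schema_fingerprint(output_type),
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


@dataclasses.dataclass
class FooV1:  # hypothetical: Foo before the schema change
    a: int


@dataclasses.dataclass
class FooV2:  # hypothetical: Foo after adding b: int
    a: int
    b: int


key_v1 = cache_key("foo", "1.0", {"x": 1}, FooV1)
key_v2 = cache_key("foo", "1.0", {"x": 1}, FooV2)
```

Because the output schema is part of the key, the two keys differ even though the task name, cache version, and inputs are identical, which corresponds to the cache miss that the remote execution produces.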
No response
Screenshots
No response