Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Local and remote caching behaviour should not differ #5837

Open
2 of 4 tasks
fg91 opened this issue Oct 10, 2024 · 6 comments
Open
2 of 4 tasks

[BUG] Local and remote caching behaviour should not differ #5837

fg91 opened this issue Oct 10, 2024 · 6 comments
Assignees
Labels
bug Something isn't working flytekit FlyteKit Python related issue

Comments

@fg91
Copy link
Member

fg91 commented Oct 10, 2024

Describe the bug

As a user, I would expect that the caching behaviour is the same when executing a workflow in a cluster vs executing it locally as a python script.

In practice, there are situations where the behaviour differs:

Expected behavior

  • Caching of tasks without return values:

    @task(cache=True, cache_version="1.0")
    def foo() -> None:
        print("Foo")

    Locally, this task can be cached while in a cluster execution it can't be. Flyteconsole says "Caching was disabled for this execution".

    As a user, I have a strong preference for being able to cache tasks without a return value as tasks can have side effects (like e.g. storing a resulting metric in a metadata store) which don't need a return value but are still supposed to be cached. We have multiple tasks in our code base that have a dummy return value only to allow the task to be cached.

  • Cache misses upon schema changes:

    from dataclasses import dataclass
    from dataclasses_json import dataclass_json
    from flytekit import task, workflow
    
    
    @dataclass_json
    @dataclass
    class Foo:
        a: int
        # b: int
    
    
    @task(cache=True, cache_version="1.0")
    def t1() -> Foo:
        print("Foo")
        return Foo(a=42)  #, b=42)
    
    @workflow
    def wf():
        t1()
    
    
    if __name__ == "__main__":
        wf()

    When executing this workflow, adding b: int to Foo as an example of a schema change, and executing again, there is an expected cache miss in the remote execution but an unexpected cache hit in the local execution. The local behaviour needs to be adapted.

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@fg91 fg91 added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Oct 10, 2024
@fg91
Copy link
Member Author

fg91 commented Oct 10, 2024

In case anyone observes another situation where the behaviour differs, feel free to add to this issue.

@eapolinario eapolinario added the flytekit FlyteKit Python related issue label Oct 17, 2024
@eapolinario eapolinario self-assigned this Oct 17, 2024
@eapolinario eapolinario removed the untriaged This issues has not yet been looked at by the Maintainers label Oct 17, 2024
@popojk popojk removed their assignment Nov 1, 2024
@luckyarthur
Copy link

#take

@eapolinario
Copy link
Contributor

@luckyarthur , please, let me know which bugs you're going to be fixing, ok?

@luckyarthur
Copy link

@luckyarthur , please, let me know which bugs you're going to be fixing, ok?

I'm trying to fix this one now, sorry it takes longer time, cause I'm new to this whole system, I'm trying to locate where the logic of cache for none return value task at backend.

@eapolinario
Copy link
Contributor

@luckyarthur , for sure, I just meant which one of the two issues described in the issue you were planning to tackle.

@luckyarthur
Copy link

@luckyarthur , for sure, I just meant which one of the two issues described in the issue you were planning to tackle.

I'm working on both of them, since they are presented in one issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working flytekit FlyteKit Python related issue
Projects
None yet
Development

No branches or pull requests

4 participants