Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataset load / save method cannot take arguments #4346

Open
noklam opened this issue Nov 22, 2024 · 4 comments
Open

Dataset load / save method cannot take arguments #4346

noklam opened this issue Nov 22, 2024 · 4 comments
Labels
Issue: Bug Report 🐞 Bug that needs to be fixed

Comments

@noklam
Copy link
Contributor

noklam commented Nov 22, 2024

Description

from kedro.io.core import AbstractDataset

class MyDataset(AbstractDataset):
    def save(self): ...
    def _describe(self): ...
    def load(self, extra_arg): print(extra_arg)


my_dataset = MyDataset()
my_dataset.load(123)
{
	"name": "TypeError",
	"message": "MyDataset.load() takes 1 positional argument but 2 were given",
	"stack": "---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 10
      6     def load(self, extra_arg): print(extra_arg)
      9 my_dataset = MyDataset()
---> 10 my_dataset.load(123)

TypeError: MyDataset.load() takes 1 positional argument but 2 were given"
}

Context

Usually dataset use class attribute, i.e. self.load_args for passing argument to load method. This is fine when using the full Kedro framework with YAML, but it's not convenient when developing interactively. For example I want to do dataset.save(filepath=xyz) to override the settings.

Expected Result

Whether or not the load/save argument should be available for interactive use is a separate topic, if I create the method that takes an argument, it should work. In addition, the stacktrace does not give any useful information which is confusing.

Your Environment

  • Kedro version used (pip show kedro or kedro -V): 0.19.9
  • Python version used (python -V): 3.11
  • Operating system and version: MacOS
@noklam noklam changed the title <Title> Dataset load / save method cannot take arguments Nov 22, 2024
@datajoely
Copy link
Contributor

Yeah this is a no brainer

@noklam noklam added the Issue: Bug Report 🐞 Bug that needs to be fixed label Nov 22, 2024
@deepyaman
Copy link
Member

This seems more like a feature request than a bug? I don't think this has ever been supported.

Yeah this is a no brainer

Not sure I understand why. By passing args that are a property of the dataset to the save or load method, you're breaking the abstraction Kedro provides.

@datajoely
Copy link
Contributor

Agree on the labelling, but aren't you extending rather than breaking this?

@noklam
Copy link
Contributor Author

noklam commented Nov 25, 2024

This seems more like a feature request than a bug? I don't think this has ever been supported.

I haven't tested, but I think this is always supported. The way that framework invoke dataset is usually via something similar:

result = dataset.load()

So framework is not aware of any extra argument load can take, but it does not prevent load method takes extra argument in interactive flow. It is confusing because the error message does not give me any help. I am able to get around with it by actually using _load instead of load.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue: Bug Report 🐞 Bug that needs to be fixed
Projects
Status: No status
Development

No branches or pull requests

3 participants