[💡SUG] Automatic dataset and dataloader save path #864

Open
damicoedoardo opened this issue Jul 2, 2021 · 1 comment

Comments

@damicoedoardo

As I understand it, there are two levels of dataset preprocessing:
Level 1, filtering and preprocessing operations: you can save the dataset after this step by calling dataset.save()
Level 2, the train/(val)/test split: you can save the resulting dataloaders with save_split_dataloaders()

In the first case you can specify the directory where the dataset is saved.
In the second case you cannot specify the directory; the dataloaders are always saved under config['checkpoint_dir'].

Is it possible to unify the behaviour of the two saving processes?

I would propose using the data_path config parameter as the root for the three levels of data, each stored under its own directory:

  1. f'{data_path}/atomic': all the atomic files, which are currently under dataset
  2. f'{data_path}/preprocessed': all the datasets that have been preprocessed (level 1 above)
  3. f'{data_path}/{model_name}/splitted': all the dataloaders for an algorithm or category of algorithms (level 2 above)
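
To make the layout concrete, here is a minimal sketch of how the three directories could be derived from data_path. The helper names build_save_paths and ensure_save_paths are hypothetical (they are not part of the library), and the example values for data_path and model_name are made up; only pathlib from the standard library is used.

```python
from pathlib import Path


def build_save_paths(data_path, model_name):
    """Return the three proposed save directories rooted at data_path.

    Hypothetical helper for illustration only; not an existing library function.
    """
    root = Path(data_path)
    return {
        # Raw atomic files
        "atomic": root / "atomic",
        # Datasets after filtering/preprocessing (level 1)
        "preprocessed": root / "preprocessed",
        # Split dataloaders for one model or model family (level 2)
        "splitted": root / model_name / "splitted",
    }


def ensure_save_paths(data_path, model_name):
    """Create the proposed directories if they are missing and return them."""
    paths = build_save_paths(data_path, model_name)
    for directory in paths.values():
        directory.mkdir(parents=True, exist_ok=True)
    return paths
```

With something like this, both saving steps could look up their target directory from the same place, for example build_save_paths(config_data_path, model_name)["preprocessed"] for the level 1 save and ["splitted"] for the level 2 save, where config_data_path stands for whatever value data_path holds in the config.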

I find the proposed structure more intuitive and easier to use.

Thanks

damicoedoardo added the enhancement (New feature or request) label Jul 2, 2021
damicoedoardo reopened this Jul 2, 2021
2017pxy self-assigned this Jul 5, 2021
@2017pxy
Member

2017pxy commented Jul 5, 2021

@damicoedoardo Thanks for your advice, we will consider it!
