Documentation directories structure with multi-fidelity #136

Open
AwePhD opened this issue Aug 21, 2024 · 3 comments

@AwePhD

AwePhD commented Aug 21, 2024

Hi,

I have a question about the choice of the argument previous_pipeline_directory in run_pipeline. I browsed the code, and it seems that the optimizer is responsible for finding the previous trial, since it is the Optimizer's responsibility to sample trials.

My question, though, is why the argument is not has_previous_fidelity_trial with type bool, or something similar?

I do not know how you manage the workers and the multiprocessing for distributed HPO. So maybe, in some situations, the previous (fidelity) trial of the same config does not share a directory with the current one? Or maybe there is a deeper reason that I am not aware of.

Note that I am not an HPO practitioner, so my understanding of NePS and PriorBand is fairly limited. I just want to apply HPO on a deep learning model for my research.

The question is more of a sanity check for me, to confirm that I correctly understood the documentation about multi-fidelity. The most closely related pieces of documentation that I found are this subsection and the multi fidelity page. Maybe a dedicated didactic page on multi-fidelity would be good? The two examples are rich and simple, which is very good. But they might be a bit hard to grasp from a DL perspective, namely for someone not familiar with multi-fidelity HPO (SH, HB, PB ...). Or maybe it is only my personal lack of understanding.

Best,
Mathias.

@AwePhD
Author

AwePhD commented Aug 21, 2024

I misunderstood the multi-fidelity directories.

When NePS is using multi-fidelity, it creates folders like config_{number_config}_{number_fidelity}. So pipeline_directory and previous_pipeline_directory are always different. Therefore, previous_pipeline_directory is intuitive.
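To make the layout concrete, here is a minimal sketch of how a run_pipeline function could use the two directories to resume training from a lower fidelity. It does not call NePS at all: the directories are created by hand to imitate the config_{number_config}_{number_fidelity} pattern, the checkpoint.json file name and the toy "loss" update are made up for illustration, and it assumes that previous_pipeline_directory is None when there is no lower-fidelity run of the same config.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def run_pipeline(
    pipeline_directory: Path,
    previous_pipeline_directory: Optional[Path],
    learning_rate: float,
    epochs: int,
) -> float:
    """Toy training loop that resumes from the previous fidelity, if any."""
    # "checkpoint.json" is a hypothetical file name chosen for this sketch.
    state = {"epochs_done": 0, "loss": 1.0}
    if previous_pipeline_directory is not None:
        checkpoint = previous_pipeline_directory / "checkpoint.json"
        state = json.loads(checkpoint.read_text())

    # Train only the epochs that the lower fidelity has not covered yet.
    for _ in range(state["epochs_done"], epochs):
        state["loss"] *= 1.0 - learning_rate  # stand-in for a real update
        state["epochs_done"] += 1

    # Save into the *current* directory so a higher fidelity can pick it up.
    (pipeline_directory / "checkpoint.json").write_text(json.dumps(state))
    return state["loss"]


def simulate_two_fidelities(root: Path) -> Tuple[float, float]:
    """Imitate two fidelities of config 1 with hand-made directories."""
    low = root / "config_1_0"
    high = root / "config_1_1"
    low.mkdir(parents=True)
    high.mkdir(parents=True)
    loss_low = run_pipeline(low, None, learning_rate=0.5, epochs=2)
    loss_high = run_pipeline(high, low, learning_rate=0.5, epochs=4)
    return loss_low, loss_high


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(simulate_two_fidelities(Path(tmp)))
```

The second call trains only epochs 3 and 4 because it starts from the state saved in config_1_0, which is why a path is more useful here than a boolean flag.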

I am renaming the issue to ask for better documentation of the directory structure when performing multi-fidelity. I think it might be clearer. In my opinion, it is hard to predict beforehand that the directories have this structure. I think an illustration with a few paragraphs might guide new users in a relevant way? That said, the current documentation does make it straightforward to see that two run_pipeline calls have different directories. It's just a bit vague to me.

@AwePhD AwePhD changed the title from "Why previous_pipeline_directory?" to "Documentation directories structure in multi-fidelity" Aug 21, 2024
@AwePhD AwePhD changed the title from "Documentation directories structure in multi-fidelity" to "Documentation directories structure with multi-fidelity" Aug 21, 2024
@eddiebergman
Contributor

Sorry for the delay in response. Not sure why my notifications for this library are disabled -_- Honestly appreciate the feedback and we'll try to get back to you sooner!

Glad you understood it in the end and yes, your interpretation is correct. The main reason to have it in different folders is lost to time, but it does make logging of configurations and results much easier to post-process, which is how the library originally was benchmarked. It also helps a bit with paths for file locking (how the parallelism works with an arbitrary number of workers), preventing some edge cases.
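As a rough illustration of the post-processing point, the sketch below groups directory names by configuration so that each config's fidelities can be read in order. It is not NePS code: the helper name group_by_config is hypothetical, and it assumes both parts of the config_{number_config}_{number_fidelity} name are plain integers.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumed naming pattern, taken from the folder names described in this thread.
_CONFIG_DIR = re.compile(r"^config_(\d+)_(\d+)$")


def group_by_config(names: List[str]) -> Dict[int, List[int]]:
    """Map each config id to the sorted list of fidelity indices seen for it."""
    grouped = defaultdict(list)
    for name in names:
        match = _CONFIG_DIR.match(name)
        if match is None:
            continue  # ignore anything that is not a config directory
        config_id, fidelity = int(match.group(1)), int(match.group(2))
        grouped[config_id].append(fidelity)
    return {cid: sorted(fids) for cid, fids in sorted(grouped.items())}


if __name__ == "__main__":
    print(group_by_config(["config_1_0", "config_2_0", "config_1_1", "notes"]))
```

Because every (config, fidelity) pair has its own folder, a script like this never has to untangle results that were overwritten in place.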

Thanks for the issue, and we'll keep it on the todo list. Right now, a lot of the internals are being revamped to make the library more performant, usable and lean. One thing that will be revisited is how we handle multi-fidelity. I imagine we'll likely keep the same folder structure, and we can document it as such once it's done, including the specifics of the previous pipeline directory.


Some extras:

We'd like to explore many-fidelity soon, such as scaling not just epochs but also something like depth/width.
One benefit of the current pipeline directory approach (as opposed to re-using the directory) is that in a many-fidelity setup, we may ask the user to load a model from an arbitrary checkpoint, and the {config_id}_{fidelity} naming scheme no longer makes sense.

@AwePhD
Author

AwePhD commented Sep 2, 2024

Great, thanks for the feedback. I was not sure about the relevance of my issue. Keep up the good work!
