Is it possible to use LaunchMON with PBS instead of Slurm? #64

Open
sloede opened this issue Jan 27, 2023 · 4 comments

Comments

@sloede

sloede commented Jan 27, 2023

As the title states, we would like to use LaunchMON together with PBS. Ultimately, we are aiming to get Spindle to work, which seems to have LaunchMON as a requirement. Is there any precedent or experience for using LaunchMON with PBS, or is it known that it does not work (or works only with limitations)?

@mplegendre
Member

(Speaking from the spindle side)

Launchmon is an optional dependency for Spindle, not a hard requirement. If launchmon supports an RM, then Spindle can use launchmon for that RM.

Alternatively, Spindle could be directly ported to use the RM without launchmon. That's what Spindle on slurm does.
Or, you can modify the RM to call into Spindle's launch API (https://github.com/hpc/Spindle/blob/devel/doc/spindle_launch_README.md), which gives you spindle capabilities directly in your RM. That's what flux does.

@sloede
Author

sloede commented Jan 27, 2023

Thanks a lot for the fast response! And thanks for the clarification that launchmon is not strictly necessary to use spindle.

However, the document you referenced is quite intimidating, so using the API directly seems like a daunting task. Do you know if there is precedent for using spindle with PBS, either directly or through launchmon, and whether someone has maybe even written this down somewhere (publicly) 😬?

@mplegendre
Member

I don't think there's been any PBS usage. It would need to be coded up, and I haven't done that nor spoken to anyone about it.

It's not too tough to add Spindle to an RM. The document looks more intimidating than it is because Spindle's last release added a bunch of optional API calls. But the core API is only about five Spindle functions that need calling from certain places in the RM. Old versions of the document capture the core API better: https://github.com/hpc/Spindle/blob/057ace51cef8f3ce47a52a4ca3bcec49d48bb640/doc/spindle_launch_README.md

The flux team got it working after a ~30 minute phone call and a day or two of work: hpc/Spindle#50

@mplegendre
Member

Something else I just remembered:

If you're just looking for a one-off Spindle/PBS setup, then there are some quick and dirty hacks in Spindle that can get it running. Specifically, Spindle has a "hostbin" mode where you provide a script to spindle that interfaces with the RM and supplies the information Spindle needs, such as the list of hosts in the job you want to spindle-ize. I think you might also need to modify your application launch line to tell spindle where to insert its bootstrapper.

I honestly haven't run hostbin mode in years. It might need some cobwebs dusted off, and I wouldn't recommend it for any production rollouts. But it is the lowest bar for running spindle.
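
For illustration, here is a minimal sketch of the PBS-facing half of such a hostbin script. It assumes the job's allocated nodes are listed in the file named by the `PBS_NODEFILE` environment variable, one hostname per line and possibly repeated once per slot, which is how PBS normally exposes them. The helper names `unique_hosts` and `read_pbs_hosts` are made up for this example, and the output format that Spindle's hostbin mode actually expects is not stated in this thread, so treat "a plain list of unique hostnames" as an assumption to check against the Spindle sources.

```python
import os


def unique_hosts(nodefile_lines):
    """Return hostnames in first-seen order, dropping blank lines and repeats."""
    seen = set()
    hosts = []
    for line in nodefile_lines:
        host = line.strip()
        if host and host not in seen:
            seen.add(host)
            hosts.append(host)
    return hosts


def read_pbs_hosts(env=os.environ):
    """Read the hosts of the current PBS job from the file named by PBS_NODEFILE.

    Raises RuntimeError when PBS_NODEFILE is not set, i.e. outside a PBS job.
    """
    nodefile = env.get("PBS_NODEFILE")
    if not nodefile:
        raise RuntimeError("PBS_NODEFILE is not set; run this inside a PBS job")
    with open(nodefile) as f:
        return unique_hosts(f)
```

A wrapper script could then print the result of `read_pbs_hosts()` one host per line, or in whatever form hostbin mode turns out to require.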
