Batch job support for Pangeo hubs #15

Open
rabernat opened this issue Oct 11, 2022 · 3 comments

Comments

@rabernat

Context

Pangeo hub users often want to run a long-running job in the background. On our hubs today, they instead have to keep a notebook open for the entire duration of the job. This leads to awkward and inefficient workflows, such as postdocs not being able to close their laptops for days.

cc @paigem, @jbusecke

Proposal

I propose we install kbatch on the Pangeo hubs. We discussed this idea quite a while back, but I can't find any record of that conversation.

Updates and actions

No response

@jbusecke

This would be an awesome addition to the hub for pretty much every project I am involved in. Enthusiastic 👍, and happy to test!

@jmunroe

jmunroe commented Oct 11, 2022

I agree that batch submission is an important class of computing for many problems. And it does appear that kbatch would be one way of providing that functionality.

But, from what I am reading, kbatch is a relatively simple wrapper for submitting a job to a Kubernetes cluster, and it lacks features compared to something like Slurm or Prefect. I think it makes sense to use it for running a notebook automatically in a lights-out way, but is its intended use case only relatively short jobs? Is kbatch still appropriate for multi-day batch computing?

While it is of course no problem to experiment with kbatch if that solves people's immediate problems, are there other scheduling systems that we should be considering? I am worried about things like setting limits on long-running jobs, resource allocation between users, reporting, and logging. I think other hubs are also discussing "batch" computing, so I'm going to add this as a feature request at our weekly 2i2c Product and Engineering meeting to see if there are other options that should be considered.
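
To make the concern about limits concrete, here is a minimal sketch of the kind of Kubernetes Job manifest a batch wrapper might submit. It does not use kbatch's API and is not how kbatch is known to build jobs; the helper `build_job_manifest`, the image name, the script name, and the default limits are all hypothetical. Only the manifest fields (`activeDeadlineSeconds`, `backoffLimit`, `ttlSecondsAfterFinished`, `resources`) are standard `batch/v1` Job fields.

```python
import json


def build_job_manifest(name, image, command, max_runtime_hours=48,
                       cpu="2", memory="8Gi"):
    """Return a Kubernetes batch/v1 Job manifest as a plain dict.

    Hypothetical helper for illustration only; the defaults are assumptions,
    not values used on any Pangeo hub.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Kubernetes terminates the Job once it has been active this long,
            # which caps the wall-clock time of a multi-day run.
            "activeDeadlineSeconds": int(max_runtime_hours * 3600),
            # Do not retry failed pods automatically.
            "backoffLimit": 0,
            # Clean up the finished Job (and its pods) after one day.
            "ttlSecondsAfterFinished": 86400,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "main",
                            "image": image,
                            "command": list(command),
                            # Per-job CPU and memory bounds.
                            "resources": {
                                "requests": {"cpu": cpu, "memory": memory},
                                "limits": {"cpu": cpu, "memory": memory},
                            },
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = build_job_manifest(
        name="long-running-analysis",
        image="pangeo/pangeo-notebook:latest",  # placeholder image
        command=["python", "long_running_analysis.py"],  # hypothetical script
        max_runtime_hours=72,
    )
    print(json.dumps(manifest, indent=2))
```

A manifest like this covers a per-job time limit and per-job resources, but not fair sharing between users, reporting, or centralized logging, which would need something else (for example namespace-level quotas or a fuller scheduler).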

@rabernat
Author

All good questions, @jmunroe. @yuvipanda worked quite a bit on kbatch and decided it was the sweet spot in terms of complexity. But I'm happy to align with whatever tools 2i2c wants to support here.
