Add ability to instantiate named clusters from jobqueue config #604

Closed · wants to merge 4 commits

Conversation


@alisterburt commented Mar 6, 2023

Discussed in #543

This PR allows custom clusters of any type to be specified in the jobqueue YAML.

Below is an example config file with a second PBS cluster that uses a different queue called 'gpu':

jobqueue:
  pbs:
    cores: 36
    memory: 100GB
    queue: regular
  custom-pbs-cluster:
    job-scheduling-system: pbs
    queue: gpu

This can be instantiated with JobQueueCluster.from_name():

from dask_jobqueue import JobQueueCluster

cluster = JobQueueCluster.from_name("custom-pbs-cluster")

This also works directly with the standard job scheduling systems:

from dask_jobqueue import JobQueueCluster

cluster = JobQueueCluster.from_name("pbs")
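
Presumably from_name() resolves the named config section to a cluster class and passes the remaining keys through as keyword arguments. A hedged sketch of that lookup (the CLUSTER_CLASSES name appears in the review discussion below; the exact code in the diff may differ):

import dask.config
from dask_jobqueue import HTCondorCluster, PBSCluster, SLURMCluster

# Map job-scheduling-system names to cluster classes (subset shown)
CLUSTER_CLASSES = {
    "pbs": PBSCluster,
    "slurm": SLURMCluster,
    "htcondor": HTCondorCluster,
}

def from_name(name):
    # Fetch the named section of the jobqueue config as a plain dict
    config = dict(dask.config.get(f"jobqueue.{name}"))
    # 'job-scheduling-system' picks the class; the standard names
    # ('pbs', 'slurm', ...) fall back to the section name itself
    scheduling_system = config.pop("job-scheduling-system", name)
    # Assumes the remaining yaml keys line up with the class's kwargs
    return CLUSTER_CLASSES[scheduling_system](**config)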

The main reasons for this PR are:

  1. to enable instantiating clusters of arbitrary type (SLURM, PBS) from scripts without writing against a specific cluster class
  2. to allow multiple cluster configurations for the same scheduling system that differ slightly
  3. to pave the way for cluster discovery from dask-ctl (which would enable named cluster specification from there)

#543 was closed because the features provided here can be made available through a dask-ctl specific config yaml. I think this PR is still necessary because implementing autodiscovery of dask-jobqueue clusters in dask-ctl requires some way of 'discovering' possible clusters. With this PR, we can simply iterate over the keys in the jobqueue config and instantiate the specified clusters using JobQueueCluster.from_name().
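
To make the discovery step concrete, a minimal sketch (hypothetical, not code from this PR) of enumerating candidate cluster names from the jobqueue config:

import dask.config

# Every top-level key under 'jobqueue' is a candidate named cluster that
# JobQueueCluster.from_name() could then instantiate on demand.
candidate_names = list(dask.config.get("jobqueue", default={}))
print(candidate_names)  # e.g. ['pbs', 'custom-pbs-cluster']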

Additionally, the config in dask-jobqueue is less Python-specific than the dask-ctl yaml spec - this may be less of a burden for configuration by HPC admins who are not experts in Python.

#95 provides similar functionality but is implemented at the cluster-class level, so it does not solve point 1 above.

cc @guillaumeeb because you were involved in earlier discussions

Thank you in advance for any time spent reviewing!

@guillaumeeb (Member)

Thanks @alisterburt for this proposal.

I went through the discussion in dask-contrib/dask-ctl#61, and I'm not sure of the outcome. @jacobtomlinson do you think the changes here are necessary, or should we wait for a PR in dask-ctl?

@alisterburt (Author)

Thank you for taking the time to go through the PR/discussion @guillaumeeb!

I think the correct answer here depends on how you think dask-jobqueue users should ideally configure their clusters moving forwards. Is that through a jobqueue config or a dask-ctl config? There is a valid argument against having both, but...

  • I doubt dask-jobqueue will ever completely disallow configuration through the jobqueue config in favour of dask-ctl
  • a dask-jobqueue-only config minimises the surface area of the Dask API users have to understand
  • this PR will enable discovery of clusters specified in a jobqueue config within dask-ctl

I see the two PRs (this and the future one in dask-ctl) as orthogonal and complementary unless there is an effort to move to using dask-ctl as the only way to create/manage clusters in the dask ecosystem

@alisterburt (Author)

Fixed the silly test failure, but the workflow requires re-approval from a maintainer 🙂

@guillaumeeb (Member) left a comment


If this is all it takes in code modifications, clearly I'm in favor of this!
I'm almost surprised that CLUSTER_CLASSES[job_scheduling_system](**config) actually does the trick! Do we have consistent naming everywhere between kwargs and yaml entries?

I've left a few comments, and also it would be good to have some documentation about this functionality, could you add some?

Also, I'd like to have @jacobtomlinson approval before merging, at least on the idea!


On the diff (docstring):

    Named clusters should specify the appropriate job scheduling system under the
    'job-scheduling-system' key in the jobqueue config alongside specific keyword
    arguments passed to the JobQueue cluster class on initialisation.

@guillaumeeb (Member):

Suggested change:

-    arguments passed to the JobQueue cluster class on initialisation.
+    arguments passed to the JobQueueCluster class on initialisation.


On the diff (test code):

    def test_jobqueuecluster_from_name_for_existing_scheduling_systems():
        for scheduling_system in CLUSTER_CLASSES.keys():
            temporary_config = {

@guillaumeeb (Member):

Is setting this temporary config really useful to this test?



On the diff (test code):

    def test_jobqueuecluster_from_name_for_custom_cluster():
        temporary_config = {

@guillaumeeb (Member):

Here also you don't seem to check the custom config. Wasn't that meant to check that the htcondor default config wasn't taken into account?

@jacobtomlinson (Member) left a comment

TL;DR: I see this PR as stepping too far in the direction of implementing cluster lifecycle management and would prefer to see the broader community collaborating on dask-ctl rather than building duplicate solutions independently.


I disagree about this being orthogonal to the dask-ctl discussion. From a project scope and overall Dask maintenance perspective this feels like it shouldn't live in dask-jobqueue.

The problem this PR solves is "How can admins preconfigure user environments with some cluster templates?". The discussion in dask-ctl solves the same problem but for everyone.

Note: Dask Gateway already has a solution for this for HPC. I'm keen to upstream things from gateway into ctl too because gateway has limited scope and use cases.

Note: Dask Labextension also has similar functionality (which inspired the dask-ctl code) for defining a template for the "New cluster" button. The goal is for this to leverage dask-ctl in the future instead.

My goal is to try and push as much cluster lifecycle logic as possible into dask-ctl so that it is all in one place and is easily reusable between subprojects. I dislike the idea of individual deployment projects implementing their own solutions to this problem, and other problems like it, rather than collaborating on a shared tool like dask-ctl.

> I see the two PRs (this and the future one in dask-ctl) as orthogonal and complementary unless there is an effort to move to using dask-ctl as the only way to create/manage clusters in the dask ecosystem

The goal of dask-ctl is to be a control plane for all Dask clusters. It won't be the only way to do things, but it will be the most consistent and convenient way to do cluster lifecycle things.

> I doubt dask-jobqueue will ever completely disallow configuration through the jobqueue config in favour of dask-ctl

I agree and I expect in the future that users can still use the individual libraries directly to "pull the cables and move the flaps" if they want to.

I also think cluster creation will always be simplest by using the library directly. The value in dask-ctl is in value-add things like discovery, listing, aggregating, templating and single-pane-of-glass views.

> a dask-jobqueue-only config minimises the surface area of the Dask API users have to understand
> this PR will enable discovery of clusters specified in a jobqueue config within dask-ctl

I would push back on this. This PR expands the Dask API here instead of guiding folks to use an existing API. From a user perspective there is still API to learn.

We are on a journey with dask-ctl where hopefully one day it will be upstreamed to distributed or made directly available via the dask namespace like dask.distributed is today. But to get there we need folks to use it rather than working independently.

My question to @alisterburt is what can we do to make dask-ctl better and reduce friction there? For example I totally sympathise with your point about the use of the keywords 'args' and 'kwargs' in the cluster spec. Maybe we want to change that to something easier to understand in a v2 spec? (This is why I versioned the spec, so we can iterate nicely.)

@alisterburt (Author) commented Mar 13, 2023

@guillaumeeb @jacobtomlinson thank you both for the reviews!

@guillaumeeb as there is currently no consensus around whether this PR is wanted, I will hold off on updating for now.

@jacobtomlinson I am completely on board with the centralisation of cluster lifecycle management in dask-ctl; one of the main reasons for this PR was to simplify the implementation of dask-ctl discovery in dask-jobqueue in a follow-up PR 🙂

> The problem this PR solves is "How can admins preconfigure user environments with some cluster templates?". The discussion in dask-ctl solves the same problem but for everyone.

> The goal of dask-ctl is to be a control plane for all Dask clusters. It won't be the only way to do things, but it will be the most consistent and convenient way to do cluster lifecycle things.

> I expect in the future that users can still use the individual libraries directly to "pull the cables and move the flaps" if they want to.

I think the main problem this PR solves is actually "how can users preconfigure multiple clusters of the same type without leaving jobqueue?". This is currently not possible using the jobqueue config and adding dask-ctl into the mix is a good (i.e. not ad-hoc) solution. I think not being able to do this is a deficiency of the current jobqueue config that should be solved in jobqueue.

> I also think cluster creation will always be simplest by using the library directly. The value in dask-ctl is in value-add things like discovery, listing, aggregating, templating and single-pane-of-glass views.

> My goal is to try and push as much cluster lifecycle logic as possible into dask-ctl so that it is all in one place and is easily reusable between subprojects. I dislike the idea of individual deployment projects implementing their own solutions to this problem, and other problems like it, rather than collaborating on a shared tool like dask-ctl.

> We are on a journey with dask-ctl where hopefully one day it will be upstreamed to distributed or made directly available via the dask namespace like dask.distributed is today. But to get there we need folks to use it rather than working independently.

This is exciting! It feels like clearly delineating what dask-ctl and dask-foo should be responsible for (in the ideal case) moving forwards would be useful here. I'm still a bit unsure.

> My question to @alisterburt is what can we do to make dask-ctl better and reduce friction there? For example I totally sympathise with your point about the use of the keywords 'args' and 'kwargs' in the cluster spec. Maybe we want to change that to something easier to understand in a v2 spec? (This is why I versioned the spec, so we can iterate nicely.)

For my own personal use dask-ctl is working wonderfully; I'm happy creating clusters from yaml paths for now and will be happier creating from names once we get dask-contrib/dask-ctl#61 going. As a developer making things for non-Python folks, most likely working in HPC, I really like the idea of developing against dask-ctl and creating/discovering dask-foo clusters from there.

This is how I imagine the top of programs I will provide to people:

from dask_ctl import get_cluster, create_cluster

# cluster_name comes in from the outside world
try:
    cluster = get_cluster(cluster_name)
except NameError:
    cluster = create_cluster(cluster_name)

The friction for my use case comes in when asking users to configure a cluster: they need both jobqueue and ctl rather than just jobqueue. I'm optimising to keep the set of things users interact with as minimal as possible.

Overall I am happy to go either way in this repo and will PR to dask-ctl soon regardless, but I do think there is merit to including this PR in jobqueue 🙂

@jacobtomlinson (Member) commented Mar 14, 2023

> I think not being able to do this is a deficiency of the current jobqueue config that should be solved in jobqueue.

I think the core of my comments is that creating cluster templates is out of scope for dask-{jobqueue,kubernetes,cloudprovider,yarn,etc} and we should be doing this in dask-ctl.

To quote the Zen of Python: "There should be one-- and preferably only one --obvious way to do it."

> one of the main reasons for this PR was to simplify the implementation of dask-ctl discovery in dask-jobqueue

I haven't reviewed this PR line by line yet as I wanted to have the high-level discussion first. However, at a quick glance I don't think this satisfies the dask-ctl expectations for from_name. Two (or more) processes should each be able to call dask_jobqueue.SLURMCluster.from_name("awesomecluster") and all get a pointer to exactly the same cluster. My reading of this PR would be that a new cluster would be spawned each time.

In this PR, 'name' seems to be a template name. In a dask-ctl context, 'name' is the unique name of a cluster instance.

> This is how I imagine the top of programs I will provide to people

That looks like a nice example. In dask-kubernetes we do a check for an existing cluster with that name before trying to create a new one so the code example is further simplified. It would be nice to do this in more places.

from dask_ctl import create_cluster

cluster = create_cluster("/path/to/spec.yaml")  # If 'foo' exists it connects to it, if not it creates it
# /path/to/spec.yaml
version: 1
module: "dask_kubernetes.operator"
class: "KubeCluster"
kwargs:
    name: "foo"

Most people only want a client anyway, so a one-liner would be:

client = create_cluster("/path/to/spec.yaml").get_client()
# They can still access the cluster object at 'client.cluster' if they want to call 'client.cluster.scale(10)' or something

Here are some examples of how you can use KubeCluster:

from dask_kubernetes.operator import KubeCluster

cluster = KubeCluster(name="foo")  # If 'foo' exists it connects to it, if not it creates it (most users want this)

# Users who don't want this can control creation/connection

cluster = KubeCluster.from_name("foo")  # If 'foo' exists it connects to it, if not it raises an exception (this meets dask-ctl expectations)
cluster = KubeCluster(name="foo", create_mode="CONNECT_ONLY")  # Equivalent to 'from_name'

cluster = KubeCluster(name="foo", create_mode="CREATE_ONLY")  # If 'foo' exists it raises an exception, if not it creates it

> The friction for my use case comes in when asking users to configure a cluster: they need both jobqueue and ctl rather than just jobqueue

I totally understand. Ultimately these are all just bits of Dask, but we avoid installing them all by default to reduce bloat in the core package. But I understand that separate namespaces can add friction. Do you see this problem being more of an install-time issue or an import-time issue? Would making dask-ctl a dependency of dask-jobqueue resolve the install-time friction?

@alisterburt (Author)

> I think the core of my comments is that creating cluster templates is out of scope for dask-{jobqueue,kubernetes,cloudprovider,yarn,etc} and we should be doing this in dask-ctl.

> To quote the Zen of Python: "There should be one-- and preferably only one --obvious way to do it."

Cool! Thank you for being patient, this is clear guidance I can get behind 🙂 let's close this PR and push for realising the ideal ecosystem then.

> Two (or more) processes should each be able to call dask_jobqueue.SLURMCluster.from_name("awesomecluster") and all get a pointer to exactly the same cluster. My reading of this PR would be that a new cluster would be spawned each time.

Your reading is correct, and I think my understanding of dask here is lacking: if I had multiple users in an HPC environment, what are the advantages/disadvantages of having them connect to the same Client/Cluster vs creating their own? Naively I thought we would want users to run their own for resiliency (other users can't crash my thing) and to avoid overwhelming the task graph if there are multiple heavy users.

Thanks for the dask-kubernetes example - that's clean and flexible!

> I understand that separate namespaces can add friction. Do you see this problem being more of an install-time issue or an import-time issue? Would making dask-ctl a dependency of dask-jobqueue resolve the install-time friction?

Sure - I think ~all friction would be removed if dask-ctl were bundled with jobqueue/dask itself at install time and the jobqueue documentation used dask-ctl for cluster templates rather than its own config. @guillaumeeb would you/the other maintainers here accept PRs in this direction?

@jacobtomlinson (Member)

> what are the advantages/disadvantages of having them connect to the same Client/Cluster vs creating their own

It's less about multiple users sharing a single cluster and more about multiple processes sharing a cluster (often sequentially, not at the same time). Workflow managers like Airflow have multiple stages in a pipeline which are each launched as a separate process or allocation with various levels of fan-in and fan-out. It's nice to have the first stage in your pipeline create a Dask cluster, each stage after that reuse the Dask resource, and the final stage delete the Dask cluster.
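
A sketch of that pattern, reusing the KubeCluster API from the examples above (the cluster name and stage boundaries are illustrative):

from dask_kubernetes.operator import KubeCluster

# Stage 1 (first process in the pipeline): create the cluster
cluster = KubeCluster(name="pipeline")

# Stages 2..N-1 (separate processes): reconnect to the same cluster by name
cluster = KubeCluster.from_name("pipeline")

# Final stage: reconnect one last time and tear the cluster down
cluster = KubeCluster.from_name("pipeline")
cluster.close()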

It's also very helpful in some failure modes that are not common on HPC. HPC allocations have a wall time, so if you use dask-jobqueue to create a cluster and the whole thing blows up (and the Python process dies without calling the finalizers via a SIGKILL) the Dask cluster will eventually be reaped by the resource manager, not much harm done. Kubernetes and Cloud don't have any concept of wall time so if you lose the Python process that created your cluster the cleanup is very manual and error-prone. Failures there can be costly if you can't quickly identify left-over Dask clusters and quickly delete them.

With the new KubeCluster implementation I mentioned before, you can quickly create a new object that points to your cluster and then call cluster.close() to clean up. This is also a core purpose of dask-ctl: being able to run dask cluster list and dask cluster delete foo is very powerful. Both of these commands depend on being able to recreate a cluster manager object to interact with the cluster.
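
In Python terms that round trip is short (using the dask_ctl helper already shown earlier in the thread; the cluster name is illustrative):

from dask_ctl import get_cluster

# Recreate a cluster manager object for an already-running cluster by name,
# then use it to clean the cluster up - the same mechanism that backs
# 'dask cluster list' and 'dask cluster delete'.
cluster = get_cluster("foo")
cluster.close()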

> if dask-ctl were bundled with jobqueue/dask itself at install time and the jobqueue documentation used dask-ctl for cluster templates rather than its own config

I would love to see this. What do you think @guillaumeeb?

@alisterburt (Author)

You're a star @jacobtomlinson - I learned a bunch and the motivation behind centralising this management is now crystal clear, thanks for writing this up!

@guillaumeeb (Member)

> if dask-ctl were bundled with jobqueue/dask itself at install time and the jobqueue documentation used dask-ctl for cluster templates rather than its own config
>
> I would love to see this. What do you think @guillaumeeb?

Well, wow, that's a pretty complete and detailed discussion here. I have to admit I didn't spend a lot of time looking at dask-ctl... So my vision here is not yet broad enough. As first thoughts:

  • I don't mind having dask-ctl bundled with dask-jobqueue; I think you justified the ultimate goal really well.
  • I'd also be really happy to have some documentation here on how to use dask-ctl for dask-jobqueue cluster management, and why it can be useful.
  • However, I'm currently a bit afraid of the 'rather than its own config' part 🙂, and I'm not sure what all this would imply. Should we discourage the use of the existing dask-jobqueue yaml configuration? Should we drop (part of?) the underlying code at some point?

@jacobtomlinson (Member)

> Should we drop (part of?) the underlying code at some point?

I'm not suggesting removing any of the existing configuration in dask-jobqueue. All the configuration does today is set the default kwargs for the HPC cluster objects, which totally makes sense.

My comment was more that the config here shouldn't be expanded to include lifecycle things like cluster templates. Instead we should work together to do this in dask-ctl so that all dask-foo projects can benefit from shared functionality.

@guillaumeeb
Copy link
Member

Ok, perfectly fine for me!
