Skip to content

Commit

Permalink
Update deployment documentation
Browse files Browse the repository at this point in the history
This adds code samples to the recommended sections and a bit of context
around the challenges of each approach.
  • Loading branch information
mrocklin committed Feb 1, 2024
1 parent 08341db commit 2bafd17
Showing 1 changed file with 82 additions and 29 deletions.
111 changes: 82 additions & 29 deletions docs/source/deploying.rst
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ Deploy Dask Clusters
Local Machine
-------------

You don't need to do any setup to run Dask. Dask will use threads
You can run Dask without any setup. Dask will use threads
on your local machine by default.

.. code-block:: python
Expand All @@ -47,9 +47,9 @@ on your local machine by default.
df = dd.read_csv(...)
df.x.sum().compute() # This uses threads on your local machine
Alternatively, you can set up a fully-featured Dask cluster on your local
machine. This gives you access to multi-process computation and diagnostic
dashboards.
Alternatively, you can set up a fully-featured multi-process Dask cluster on
your local machine. This gives you access to multi-process computation and
diagnostic dashboards.

.. code-block:: python
Expand Down Expand Up @@ -84,59 +84,112 @@ The following resources explain how to set up Dask on a variety of local and dis

Cloud
-----
|Coiled|_ **is recommended for deploying Dask on the cloud.**
Though there are other options you may consider depending on your specific needs:

- `Coiled <https://coiled.io?utm_source=dask-docs&utm_medium=deploying>`_: Commercial Dask deployment option, which handles the creation and management of Dask clusters on cloud computing environments (AWS, GCP, and Azure).
- `Dask Cloud Provider <https://cloudprovider.dask.org/en/latest/>`_: Constructing and managing ephemeral Dask clusters on AWS, DigitalOcean, Google Cloud, Azure, and Hetzner.
- `Dask-Yarn <https://yarn.dask.org>`_: Deploy Dask on YARN clusters, such as are found in traditional Hadoop installations.
Deploying on commercial cloud like AWS, GCP, or Azure is convenient because you can quickly scale out to many machines for just a few minutes, but also challenging because you need to navigate awkward cloud APIs, manage remote software environments with Docker, send data access credentials, make sure that costly resources are cleaned up, etc.. The following solutions help with this process.

- `Coiled (recommended) <https://coiled.io?utm_source=dask-docs&utm_medium=deploying>`_:
this commercial SaaS product handles most of the deployment pain we currently
see. The free tier is generous enough for most individual users so that it
often suffices, even for those who don't want to engage with a commercial
company. The API looks like the following.

.. code-block:: python
import coiled
cluster = coiled.Cluster(
n_workers=100,
region="us-east-2",
worker_memory="16 GiB",
spot_policy="spot_with_fallback",
)
client = cluster.get_client()
- `Dask Cloud Provider <https://cloudprovider.dask.org/en/latest/>`_: a pure and simple OSS solution that sets up Dask workers on cloud VMs, supporting AWS, GCP, Azure, and also other commercial clouds like Hetzner and Digital Ocean.

- `Dask-Yarn <https://yarn.dask.org>`_: deploys Dask on legacy YARN clusters, such as can be set up with AWS EMR or Google Cloud Dataproc

See :doc:`deploying-cloud` for more details.

.. _Coiled: https://coiled.io?utm_source=dask-docs&utm_medium=deploying
.. |Coiled| replace:: **Coiled**
.. |Coiled| replace:: **Coiled**


High Performance Computing
--------------------------
|Dask-Jobqueue|_ **is recommended for deploying Dask on HPC systems.**
Though there are other options you may consider depending on your specific needs:

- `Dask-Jobqueue <https://jobqueue.dask.org>`_: Provides cluster managers for PBS, SLURM, LSF, SGE and other resource managers.
- `Dask-MPI <http://mpi.dask.org/en/latest/>`_: Deploy Dask from within an existing MPI environment.
Dask runs on traditional HPC systems that use a resource manager like SLURM,
PBS, SGE, LSF, or similar systems, and a network file system. It can deploy
either directly through the resource manager or through
``mpirun``/``mpiexec`` and tends to use the NFS to distribute data and
software.

- `Dask-Jobqueue (recommended) <https://jobqueue.dask.org>`_: interfaces directly with the
resource manager (SLURM, PBS, SGE, LSF, and others) to launch many Dask
workers as batch jobs. It generates batch job scripts and submits them
automatically to the user's queue. This approach operates entirely with user
permissions (no IT support required) and enables interactive and adaptive use
on large HPC systems. It looks a little like the following:

.. code-block:: python
from dask_jobqueue import PBSCluster
cluster = PBSCluster( # <-- scheduler started here
cores=24,
memory='100GB',
queue='regular',
account='my-account',
)
cluster.scale(jobs=100)
client = cluster.get_client()
- `Dask-MPI <http://mpi.dask.org/en/latest/>`_: deploys Dask on top of any system that supports MPI using ``mpirun``. It is helpful for batch processing jobs where you want to ensure a fixed and stable number of workers.
- `Dask Gateway for Jobqueue <https://gateway.dask.org/install-jobqueue.html>`_: Multi-tenant, secure clusters. Once configured, users can launch clusters without direct access to the underlying HPC backend.

See :doc:`deploying-hpc` for more details.

.. _Dask-Jobqueue: https://jobqueue.dask.org
.. |Dask-Jobqueue| replace:: **Dask-Jobqueue**
.. |Dask-Jobqueue| replace:: **Dask-Jobqueue**

Kubernetes
----------
|Dask-Kubernetes|_ **is recommended for deploying Dask on Kubernetes.**
Though there are other options you may consider depending on your specific needs:

- `Dask Kubernetes Operator <https://kubernetes.dask.org/en/latest/operator.html>`_: For native Kubernetes integration for fast moving or ephemeral deployments.
Dask runs natively on Kubernetes clusters. This is a convenient choice when a
company already has dedicated Kubernetes infrastructure set up for running
other services. When running Dask on Kubernetes users should also have a plan
to distribute software environments (probably with Docker) user credentials,
quota management, etc.. In larger companies this is often handled by other
Kubernetes services.

- `Dask Kubernetes Operator (recommended)
<https://kubernetes.dask.org/en/latest/operator.html>`_: The Dask Kubernetes
Operator makes the most sense for fast moving or ephemeral deployments. It
is the most Kubernetes-native solution, and should be comfortable for K8s
enthusiasts. It looks a little like this:

.. code-block:: python
from dask_kubernetes.operator import KubeCluster
cluster = KubeCluster(
name="my-dask-cluster",
image='ghcr.io/dask/dask:latest',
resources={"requests": {"memory": "2Gi"}, "limits": {"memory": "64Gi"}},
)
cluster.scale(10)
client = cluster.get_client()
- `Dask Gateway for Kubernetes <https://gateway.dask.org/install-kube.html>`_: Multi-tenant, secure clusters. Once configured, users can launch clusters without direct access to the underlying Kubernetes backend.
- `Single Cluster Helm Chart <https://artifacthub.io/packages/helm/dask/dask>`_: Single Dask cluster and (optionally) Jupyter on deployed with Helm.

See :doc:`deploying-kubernetes` for more details.

.. _Dask-Kubernetes: https://kubernetes.dask.org/en/latest/operator.html
.. |Dask-Kubernetes| replace:: **Dask Kubernetes Operator**
.. |Dask-Kubernetes| replace:: **Dask Kubernetes Operator**

.. _managed-cluster-solutions:

Managed Solutions
-----------------
|Coiled|_ **is recommended for deploying managed Dask clusters.**
Though there are other options you may consider depending on your specific needs:

- `Coiled <https://coiled.io?utm_source=dask-docs&utm_medium=deploying>`_: Manages the creation and management of Dask clusters on cloud computing environments (AWS, GCP, and Azure).
- `Domino Data Lab <https://www.dominodatalab.com/>`_: Lets users create Dask clusters in a hosted platform.
- `Saturn Cloud <https://saturncloud.io/>`_: Lets users create Dask clusters in a hosted platform or within their own AWS accounts.


Manual deployments (not recommended)
------------------------------------

Expand Down

0 comments on commit 2bafd17

Please sign in to comment.