diff --git a/docs/source/deploying.rst b/docs/source/deploying.rst
index c55ac37cc06..c4db371784c 100644
--- a/docs/source/deploying.rst
+++ b/docs/source/deploying.rst
@@ -1,96 +1,75 @@
Deploy Dask Clusters
====================
-.. toctree::
- :maxdepth: 1
- :hidden:
-
- deploying-python.rst
- deploying-cli.rst
- deploying-ssh.rst
- deploying-docker.rst
- deploying-hpc.rst
- deploying-kubernetes.rst
- deploying-cloud.rst
- deploying-python-advanced.rst
- deployment-considerations.rst
-
-The ``dask.distributed`` scheduler works well on a single machine and scales to many machines
-in a cluster. We recommend using ``dask.distributed`` clusters at all scales for the following
-reasons:
-
-1. It provides access to asynchronous APIs, notably :doc:`Futures <../../futures>`.
-2. It provides a diagnostic dashboard that can provide valuable insight on
- performance and progress (see :doc:`dashboard`).
-3. It handles data locality with sophistication, and so can be more
- efficient than the multiprocessing scheduler on workloads that require
- multiple processes.
-
-This page describes various ways to set up Dask clusters on different hardware, either
-locally on your own machine or on a distributed cluster.
-
-You can continue reading or watch the screencast below:
-
-.. raw:: html
-
-
+.. grid:: 1 1 2 2
+
+ .. grid-item::
+ :columns: 12 12 5 5
+
+ Dask works well at many scales ranging from a single machine to clusters of
+ many machines. This section describes the many ways to deploy and run Dask,
+ including the following:
+
+ .. toctree::
+ :maxdepth: 1
+
+ deploying-python.rst
+ deploying-cli.rst
+ deploying-ssh.rst
+ deploying-docker.rst
+ deploying-hpc.rst
+ deploying-kubernetes.rst
+ deploying-cloud.rst
+ deploying-python-advanced.rst
+ deployment-considerations.rst
+
+ .. grid-item::
+ :columns: 12 12 7 7
+
+ .. figure:: images/dask-cluster-manager.svg
+
+ An overview of cluster management with Dask distributed.
+
+Single Machine
+--------------
If you import Dask, set up a computation, and call ``compute``, then you
-will use the single-machine scheduler by default.
+will use the local threaded scheduler by default.
.. code-block:: python
import dask.dataframe as dd
df = dd.read_csv(...)
- df.x.sum().compute() # This uses the single-machine scheduler by default
+ df.x.sum().compute() # This uses threads in your local process by default
-To use the ``dask.distributed`` scheduler you must set up a ``Client``.
+Alternatively, you can set up a fully-featured Dask cluster on your local
+machine. This gives you access to multi-process computation and diagnostic
+dashboards.
.. code-block:: python
- from dask.distributed import Client
- client = Client(...) # Connect to distributed cluster and override default
- df.x.sum().compute() # This now runs on the distributed system
+ from dask.distributed import LocalCluster
+ cluster = LocalCluster()
+ client = cluster.get_client()
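+
+ # The client also exposes the diagnostic dashboard mentioned above
+ print(client.dashboard_link)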
-There are many ways to start the distributed scheduler and worker components, however, the most straight forward way is to use a *cluster manager* utility class.
-
-.. code-block:: python
+ # Dask works as normal and leverages the infrastructure defined above
+ df.x.sum().compute()
- from dask.distributed import Client, LocalCluster
- cluster = LocalCluster() # Launches a scheduler and workers locally
- client = Client(cluster) # Connect to distributed cluster and override default
- df.x.sum().compute() # This now runs on the distributed system
+The ``LocalCluster`` cluster manager defined above is easy to use and works
+well on a single machine. It follows the same interface as all other Dask
+cluster managers, and so it's easy to swap out when you're ready to scale up.
-These *cluster managers* deploy a scheduler
-and the necessary workers as determined by communicating with the *resource manager*.
-All *cluster managers* follow the same interface, but with platform-specific configuration
-options, so you can switch from your local machine to a remote cluster with very minimal code changes.
-
-.. figure:: images/dask-cluster-manager.svg
- :scale: 50%
+.. code-block:: python
- An overview of cluster management with Dask distributed.
+ # You can swap out LocalCluster for other cluster types
-`Dask Jobqueue `_, for example, is a set of
-*cluster managers* for HPC users and works with job queueing systems
-(in this case, the *resource manager*) such as `PBS `_,
-`Slurm `_,
-and `SGE `_.
-Those workers are then allocated physical hardware resources.
+ from dask.distributed import LocalCluster
+ from dask_kubernetes import KubeCluster
-.. code-block:: python
+ # cluster = LocalCluster()
+ cluster = KubeCluster()  # for example, swap in a cluster running on Kubernetes
- from dask.distributed import Client
- from dask_jobqueue import PBSCluster
- cluster = PBSCluster() # Launches a scheduler and workers on HPC via PBS
- client = Client(cluster) # Connect to distributed cluster and override default
- df.x.sum().compute() # This now runs on the distributed system
+ client = cluster.get_client()
.. _deployment-options:
@@ -111,12 +90,21 @@ and debugging by using the distributed scheduler.
- :doc:`dask.distributed <deploying-python>`
The sophistication of the newer system on a single machine. This provides more advanced features while still requiring almost no setup.
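+
+For example, a ``Client`` created with no arguments starts a local
+distributed cluster and registers it as the default scheduler (a minimal
+sketch, reusing the dataframe from above):
+
+.. code-block:: python
+
+    from dask.distributed import Client
+
+    client = Client()  # starts a LocalCluster behind the scenes
+    df.x.sum().compute()  # subsequent computations use this cluster
+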
-.. _deployment-distributed:
+Manual deployments (not recommended)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Distributed Computing
----------------------
+You can set up Dask clusters by hand, or with tools like SSH; a minimal
+example follows the list below.
-There are a number of ways to run Dask on a distributed cluster (see the `Beginner's Guide to Configuring a Distributed Dask Cluster `_).
+- :doc:`Manual Setup <deploying-cli>`
+ The command line interface to set up ``dask-scheduler`` and ``dask-worker`` processes.
+- :doc:`deploying-ssh`
+ Use SSH to set up Dask across an unmanaged cluster.
+- :doc:`Python API (advanced) <deploying-python-advanced>`
+ Create ``Scheduler`` and ``Worker`` objects from Python as part of a distributed Tornado TCP application.
+
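+A minimal manual deployment might look like the following sketch; the
+address is illustrative, so substitute your own scheduler machine's host
+and port:
+
+.. code-block:: console
+
+    # On the machine acting as the scheduler (listens on port 8786 by default)
+    $ dask-scheduler
+
+    # On each worker machine, pointing at the scheduler's address
+    $ dask-worker tcp://192.0.2.1:8786
+
+A client can then connect from Python with
+``Client("tcp://192.0.2.1:8786")``.
+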
+However, we don't recommend this path. Instead, use a common resource
+manager to manage your machines, and then deploy Dask on that system.
+Those options are described below.
High Performance Computing
~~~~~~~~~~~~~~~~~~~~~~~~~~
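+
+For example, a minimal sketch with `Dask-Jobqueue <https://jobqueue.dask.org>`_
+on a PBS system; the resource arguments are illustrative:
+
+.. code-block:: python
+
+    from dask_jobqueue import PBSCluster
+
+    # Each PBS job provides one worker with these resources
+    cluster = PBSCluster(cores=24, memory="100GB")
+    cluster.scale(jobs=10)  # ask the job queue for ten worker jobs
+    client = cluster.get_client()
+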
@@ -153,21 +141,11 @@ See :doc:`deploying-cloud` for more details.
Constructing and managing ephemeral Dask clusters on AWS, DigitalOcean, Google Cloud, Azure, and Hetzner
- You can use `Coiled <https://coiled.io>`_, a commercial Dask deployment option, to handle the creation and management of Dask clusters on cloud computing environments (AWS and GCP).
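+
+For example, a minimal sketch with `Dask Cloudprovider
+<https://cloudprovider.dask.org>`_ on AWS Fargate; it assumes AWS
+credentials are already configured:
+
+.. code-block:: python
+
+    from dask_cloudprovider.aws import FargateCluster
+
+    cluster = FargateCluster(n_workers=4)  # workers run as Fargate tasks
+    client = cluster.get_client()
+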
-Ad-hoc deployments
-~~~~~~~~~~~~~~~~~~
-
-- :doc:`Manual Setup `
- The command line interface to set up ``dask-scheduler`` and ``dask-worker`` processes.
-- :doc:`deploying-ssh`
- Use SSH to set up Dask across an un-managed cluster.
-- :doc:`Python API (advanced) `
- Create ``Scheduler`` and ``Worker`` objects from Python as part of a distributed Tornado TCP application.
-
.. _managed-cluster-solutions:
Managed Solutions
~~~~~~~~~~~~~~~~~
-- You can use `Coiled `_ to handle the creation and management of Dask clusters on cloud computing environments (AWS and GCP).
+- `Coiled <https://coiled.io>`_ handles the creation and management of Dask clusters on cloud computing environments (AWS and GCP).
- `Domino Data Lab <https://www.dominodatalab.com>`_ lets users create Dask clusters in a hosted platform.
- `Saturn Cloud <https://saturncloud.io>`_ lets users create Dask clusters in a hosted platform or within their own AWS accounts.