Skip to content

Commit

Permalink
Merge pull request #888 from rapidsai/branch-22.04
Browse files Browse the repository at this point in the history
[RELEASE] dask-cuda v22.04
  • Loading branch information
raydouglass authored Apr 6, 2022
2 parents a666e9b + 61322f3 commit 29a8e29
Show file tree
Hide file tree
Showing 34 changed files with 809 additions and 875 deletions.
8 changes: 8 additions & 0 deletions .github/ops-bot.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# This file controls which features from the `ops-bot` repository below are enabled.
# - https://github.com/rapidsai/ops-bot

auto_merger: true
branch_checker: true
label_checker: true
release_drafter: true
external_contributors: false
75 changes: 56 additions & 19 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,35 +1,72 @@
# dask-cuda 22.04.00 (6 Apr 2022)

## 🚨 Breaking Changes

- Introduce incompatible-types and enables spilling of CuPy arrays ([#856](https://github.com/rapidsai/dask-cuda/pull/856)) [@madsbk](https://github.com/madsbk)

## 🐛 Bug Fixes

- Resolve build issues / consistency with conda-forge packages ([#883](https://github.com/rapidsai/dask-cuda/pull/883)) [@charlesbluca](https://github.com/charlesbluca)
- Increase test_worker_force_spill_to_disk timeout ([#857](https://github.com/rapidsai/dask-cuda/pull/857)) [@pentschev](https://github.com/pentschev)

## 📖 Documentation

- Remove description from non-existing `--nprocs` CLI argument ([#852](https://github.com/rapidsai/dask-cuda/pull/852)) [@pentschev](https://github.com/pentschev)

## 🚀 New Features

- Add --pre-import/pre_import argument ([#854](https://github.com/rapidsai/dask-cuda/pull/854)) [@pentschev](https://github.com/pentschev)
- Remove support for UCX < 1.11.1 ([#830](https://github.com/rapidsai/dask-cuda/pull/830)) [@pentschev](https://github.com/pentschev)

## 🛠️ Improvements

- Raise `ImportError` when platform is not Linux ([#885](https://github.com/rapidsai/dask-cuda/pull/885)) [@pentschev](https://github.com/pentschev)
- Temporarily disable new `ops-bot` functionality ([#880](https://github.com/rapidsai/dask-cuda/pull/880)) [@ajschmidt8](https://github.com/ajschmidt8)
- Pin `dask` & `distributed` ([#878](https://github.com/rapidsai/dask-cuda/pull/878)) [@galipremsagar](https://github.com/galipremsagar)
- Upgrade min `dask` & `distributed` versions ([#872](https://github.com/rapidsai/dask-cuda/pull/872)) [@galipremsagar](https://github.com/galipremsagar)
- Add `.github/ops-bot.yaml` config file ([#871](https://github.com/rapidsai/dask-cuda/pull/871)) [@ajschmidt8](https://github.com/ajschmidt8)
- Make Dask CUDA work with the new WorkerMemoryManager abstraction ([#870](https://github.com/rapidsai/dask-cuda/pull/870)) [@shwina](https://github.com/shwina)
- Implement ProxifyHostFile.evict() ([#862](https://github.com/rapidsai/dask-cuda/pull/862)) [@madsbk](https://github.com/madsbk)
- Introduce incompatible-types and enables spilling of CuPy arrays ([#856](https://github.com/rapidsai/dask-cuda/pull/856)) [@madsbk](https://github.com/madsbk)
- Spill to disk clean up ([#853](https://github.com/rapidsai/dask-cuda/pull/853)) [@madsbk](https://github.com/madsbk)
- ProxyObject to support matrix multiplication ([#849](https://github.com/rapidsai/dask-cuda/pull/849)) [@madsbk](https://github.com/madsbk)
- Unpin max dask and distributed ([#847](https://github.com/rapidsai/dask-cuda/pull/847)) [@galipremsagar](https://github.com/galipremsagar)
- test_gds: skip if GDS is not available ([#845](https://github.com/rapidsai/dask-cuda/pull/845)) [@madsbk](https://github.com/madsbk)
- ProxyObject implement __array_function__ ([#843](https://github.com/rapidsai/dask-cuda/pull/843)) [@madsbk](https://github.com/madsbk)
- Add option to track RMM allocations ([#842](https://github.com/rapidsai/dask-cuda/pull/842)) [@shwina](https://github.com/shwina)

# dask-cuda 22.02.00 (2 Feb 2022)

## 🐛 Bug Fixes

- Ignoe `DepecationWaning` fom `distutils.Vesion` classes ([#823](https://github.com/rapidsai/dask-cuda/pull/823)) [@pentschev](https://github.com/pentschev)
- Handle explicitly disabled UCX tanspots ([#820](https://github.com/rapidsai/dask-cuda/pull/820)) [@pentschev](https://github.com/pentschev)
- Fix egex patten to match to in test_on_demand_debug_info ([#819](https://github.com/rapidsai/dask-cuda/pull/819)) [@pentschev](https://github.com/pentschev)
- Ignore `DeprecationWarning` from `distutils.Version` classes ([#823](https://github.com/rapidsai/dask-cuda/pull/823)) [@pentschev](https://github.com/pentschev)
- Handle explicitly disabled UCX transports ([#820](https://github.com/rapidsai/dask-cuda/pull/820)) [@pentschev](https://github.com/pentschev)
- Fix regex pattern to match to in test_on_demand_debug_info ([#819](https://github.com/rapidsai/dask-cuda/pull/819)) [@pentschev](https://github.com/pentschev)
- Fix skipping GDS test if cucim is not installed ([#813](https://github.com/rapidsai/dask-cuda/pull/813)) [@pentschev](https://github.com/pentschev)
- Unpin Dask and Distibuted vesions ([#810](https://github.com/rapidsai/dask-cuda/pull/810)) [@pentschev](https://github.com/pentschev)
- Unpin Dask and Distributed versions ([#810](https://github.com/rapidsai/dask-cuda/pull/810)) [@pentschev](https://github.com/pentschev)
- Update to UCX-Py 0.24 ([#805](https://github.com/rapidsai/dask-cuda/pull/805)) [@pentschev](https://github.com/pentschev)

## 📖 Documentation

- Fix Dask-CUDA vesion to 22.02 ([#835](https://github.com/rapidsai/dask-cuda/pull/835)) [@jakikham](https://github.com/jakikham)
- Mege banch-21.12 into banch-22.02 ([#829](https://github.com/rapidsai/dask-cuda/pull/829)) [@pentschev](https://github.com/pentschev)
- Claify `LocalCUDACluste`'s `n_wokes` docstings ([#812](https://github.com/rapidsai/dask-cuda/pull/812)) [@pentschev](https://github.com/pentschev)
- Fix Dask-CUDA version to 22.02 ([#835](https://github.com/rapidsai/dask-cuda/pull/835)) [@jakirkham](https://github.com/jakirkham)
- Merge branch-21.12 into branch-22.02 ([#829](https://github.com/rapidsai/dask-cuda/pull/829)) [@pentschev](https://github.com/pentschev)
- Clarify `LocalCUDACluster`'s `n_workers` docstrings ([#812](https://github.com/rapidsai/dask-cuda/pull/812)) [@pentschev](https://github.com/pentschev)

## 🚀 New Featues
## 🚀 New Features

- Pin `dask` & `distibuted` vesions ([#832](https://github.com/rapidsai/dask-cuda/pull/832)) [@galipemsaga](https://github.com/galipemsaga)
- Expose mm-maximum_pool_size agument ([#827](https://github.com/rapidsai/dask-cuda/pull/827)) [@VibhuJawa](https://github.com/VibhuJawa)
- Simplify UCX configs, pemitting UCX_TLS=all ([#792](https://github.com/rapidsai/dask-cuda/pull/792)) [@pentschev](https://github.com/pentschev)
- Pin `dask` & `distributed` versions ([#832](https://github.com/rapidsai/dask-cuda/pull/832)) [@galipremsagar](https://github.com/galipremsagar)
- Expose rmm-maximum_pool_size argument ([#827](https://github.com/rapidsai/dask-cuda/pull/827)) [@VibhuJawa](https://github.com/VibhuJawa)
- Simplify UCX configs, permitting UCX_TLS=all ([#792](https://github.com/rapidsai/dask-cuda/pull/792)) [@pentschev](https://github.com/pentschev)

## 🛠️ Impovements
## 🛠️ Improvements

- Add avg and std calculation fo time and thoughput ([#828](https://github.com/rapidsai/dask-cuda/pull/828)) [@quasiben](https://github.com/quasiben)
- sizeof test: incease toleance ([#825](https://github.com/rapidsai/dask-cuda/pull/825)) [@madsbk](https://github.com/madsbk)
- Quey UCX-Py fom gpuCI vesioning sevice ([#818](https://github.com/rapidsai/dask-cuda/pull/818)) [@pentschev](https://github.com/pentschev)
- Standadize Distibuted config sepaato in get_ucx_config ([#806](https://github.com/rapidsai/dask-cuda/pull/806)) [@pentschev](https://github.com/pentschev)
- Fixed `PoxyObject.__del__` to use the new Disk IO API fom #791 ([#802](https://github.com/rapidsai/dask-cuda/pull/802)) [@madsbk](https://github.com/madsbk)
- GPUDiect Stoage (GDS) suppot fo spilling ([#793](https://github.com/rapidsai/dask-cuda/pull/793)) [@madsbk](https://github.com/madsbk)
- Disk IO inteface ([#791](https://github.com/rapidsai/dask-cuda/pull/791)) [@madsbk](https://github.com/madsbk)
- Add avg and std calculation for time and throughput ([#828](https://github.com/rapidsai/dask-cuda/pull/828)) [@quasiben](https://github.com/quasiben)
- sizeof test: increase tolerance ([#825](https://github.com/rapidsai/dask-cuda/pull/825)) [@madsbk](https://github.com/madsbk)
- Query UCX-Py from gpuCI versioning service ([#818](https://github.com/rapidsai/dask-cuda/pull/818)) [@pentschev](https://github.com/pentschev)
- Standardize Distributed config separator in get_ucx_config ([#806](https://github.com/rapidsai/dask-cuda/pull/806)) [@pentschev](https://github.com/pentschev)
- Fixed `ProxyObject.__del__` to use the new Disk IO API from #791 ([#802](https://github.com/rapidsai/dask-cuda/pull/802)) [@madsbk](https://github.com/madsbk)
- GPUDirect Storage (GDS) support for spilling ([#793](https://github.com/rapidsai/dask-cuda/pull/793)) [@madsbk](https://github.com/madsbk)
- Disk IO interface ([#791](https://github.com/rapidsai/dask-cuda/pull/791)) [@madsbk](https://github.com/madsbk)

# dask-cuda 21.12.00 (9 Dec 2021)

Expand Down
11 changes: 6 additions & 5 deletions conda/recipes/dask-cuda/meta.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -27,11 +27,12 @@ requirements:
- setuptools
run:
- python
- dask>=2021.11.1,<=2022.01.0
- distributed>=2021.11.1,<=2022.01.0
- pynvml >=8.0.3
- numpy >=1.16.0
- numba >=0.53.1
- dask==2022.03.0
- distributed==2022.03.0
- pynvml>=11.0.0
- numpy>=1.16.0
- numba>=0.53.1
- click==8.0.4

test:
imports:
Expand Down
6 changes: 6 additions & 0 deletions dask_cuda/__init__.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,9 @@
import sys

if sys.platform != "linux":
raise ImportError("Only Linux is supported by Dask-CUDA at this time")


import dask
import dask.dataframe.core
import dask.dataframe.shuffle
Expand Down
8 changes: 0 additions & 8 deletions dask_cuda/benchmarks/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -116,13 +116,6 @@ def parse_benchmark_args(description="Generic dask-cuda Benchmark", args_list=[]
dest="enable_rdmacm",
help="Disable RDMACM with UCX.",
)
parser.add_argument(
"--ucx-net-devices",
default=None,
type=str,
help="The device to be used for UCX communication, or 'auto'. "
"Ignored if protocol is 'tcp'",
)
parser.add_argument(
"--interface",
default=None,
Expand Down Expand Up @@ -195,7 +188,6 @@ def parse_benchmark_args(description="Generic dask-cuda Benchmark", args_list=[]

def get_cluster_options(args):
ucx_options = {
"ucx_net_devices": args.ucx_net_devices,
"enable_tcp_over_ucx": args.enable_tcp_over_ucx,
"enable_infiniband": args.enable_infiniband,
"enable_nvlink": args.enable_nvlink,
Expand Down
44 changes: 21 additions & 23 deletions dask_cuda/cli/dask_cuda_worker.py
Original file line number Diff line number Diff line change
Expand Up @@ -43,9 +43,7 @@
type=str,
default=None,
help="""A unique name for the worker. Can be a string (like ``"worker-1"``) or
``None`` for a nameless worker. If used with ``--nprocs``, then the process number
will be appended to the worker name, e.g. ``"worker-1-0"``, ``"worker-1-1"``,
``"worker-1-2"``.""",
``None`` for a nameless worker.""",
)
@click.option(
"--memory-limit",
Expand Down Expand Up @@ -119,16 +117,22 @@
Logging will only be enabled if ``--rmm-pool-size`` or ``--rmm-managed-memory``
are specified.""",
)
@click.option(
"--rmm-track-allocations/--no-rmm-track-allocations",
default=False,
show_default=True,
help="""Track memory allocations made by RMM. If ``True``, wraps the memory
resource of each worker with a ``rmm.mr.TrackingResourceAdaptor`` that
allows querying the amount of memory allocated by RMM.""",
)
@click.option(
"--pid-file", type=str, default="", help="File to write the process PID.",
)
@click.option(
"--resources",
type=str,
default="",
help="""Resources for task constraints like ``"GPU=2 MEM=10e9"``. Resources are
applied separately to each worker process (only relevant when starting multiple
worker processes with ``--nprocs``).""",
help="""Resources for task constraints like ``"GPU=2 MEM=10e9"``.""",
)
@click.option(
"--dashboard/--no-dashboard",
Expand Down Expand Up @@ -248,21 +252,6 @@
help="""Set environment variables to enable UCX RDMA connection manager support,
requires ``--enable-infiniband``.""",
)
@click.option(
"--net-devices",
type=str,
default=None,
help="""Interface(s) used by workers for UCX communication. Can be a string (like
``"eth0"`` for NVLink or ``"mlx5_0:1"``/``"ib0"`` for InfiniBand), ``"auto"``
(requires ``--enable-infiniband``) to pick the optimal interface per-worker based on
the system's topology, or ``None`` to stay with the default value of ``"all"`` (use
all available interfaces).
.. warning::
``"auto"`` requires UCX-Py to be installed and compiled with hwloc support.
Unexpected errors can occur when using ``"auto"`` if any interfaces are
disconnected or improperly configured.""",
)
@click.option(
"--enable-jit-unspill/--disable-jit-unspill",
default=None,
Expand All @@ -281,6 +270,13 @@
help="""Use a different class than Distributed's default (``distributed.Worker``)
to spawn ``distributed.Nanny``.""",
)
@click.option(
"--pre-import",
default=None,
help="""Pre-import libraries as a Worker plugin to prevent long import times
bleeding through later Dask operations. Should be a list of comma-separated names,
such as "cudf,rmm".""",
)
def main(
scheduler,
host,
Expand All @@ -293,6 +289,7 @@ def main(
rmm_managed_memory,
rmm_async,
rmm_log_directory,
rmm_track_allocations,
pid_file,
resources,
dashboard,
Expand All @@ -310,9 +307,9 @@ def main(
enable_infiniband,
enable_nvlink,
enable_rdmacm,
net_devices,
enable_jit_unspill,
worker_class,
pre_import,
**kwargs,
):
if tls_ca_file and tls_cert and tls_key:
Expand Down Expand Up @@ -344,6 +341,7 @@ def main(
rmm_managed_memory,
rmm_async,
rmm_log_directory,
rmm_track_allocations,
pid_file,
resources,
dashboard,
Expand All @@ -359,9 +357,9 @@ def main(
enable_infiniband,
enable_nvlink,
enable_rdmacm,
net_devices,
enable_jit_unspill,
worker_class,
pre_import,
**kwargs,
)

Expand Down
38 changes: 7 additions & 31 deletions dask_cuda/cuda_worker.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,37 +17,24 @@
enable_proctitle_on_children,
enable_proctitle_on_current,
)
from distributed.worker import parse_memory_limit
from distributed.worker_memory import parse_memory_limit

from .device_host_file import DeviceHostFile
from .initialize import initialize
from .proxify_host_file import ProxifyHostFile
from .utils import (
CPUAffinity,
PreImport,
RMMSetup,
_ucx_111,
cuda_visible_devices,
get_cpu_affinity,
get_n_gpus,
get_ucx_config,
get_ucx_net_devices,
nvml_device_index,
parse_device_memory_limit,
)


def _get_interface(interface, host, cuda_device_index, ucx_net_devices):
if host:
return None
else:
return interface or get_ucx_net_devices(
cuda_device_index=cuda_device_index,
ucx_net_devices=ucx_net_devices,
get_openfabrics=False,
get_network=True,
)


class CUDAWorker(Server):
def __init__(
self,
Expand All @@ -62,6 +49,7 @@ def __init__(
rmm_managed_memory=False,
rmm_async=False,
rmm_log_directory=None,
rmm_track_allocations=False,
pid_file=None,
resources=None,
dashboard=True,
Expand All @@ -77,9 +65,9 @@ def __init__(
enable_infiniband=None,
enable_nvlink=None,
enable_rdmacm=None,
net_devices=None,
jit_unspill=None,
worker_class=None,
pre_import=None,
**kwargs,
):
# Required by RAPIDS libraries (e.g., cuDF) to ensure no context
Expand Down Expand Up @@ -170,16 +158,6 @@ def del_pid_file():
"RMM managed memory and NVLink are currently incompatible."
)

if _ucx_111 and net_devices == "auto":
warnings.warn(
"Starting with UCX 1.11, `ucx_net_devices='auto' is deprecated, "
"it should now be left unspecified for the same behavior. "
"Please make sure to read the updated UCX Configuration section in "
"https://docs.rapids.ai/api/dask-cuda/nightly/ucx.html, "
"where significant performance considerations for InfiniBand with "
"UCX 1.11 and above is documented.",
)

# Ensure this parent dask-cuda-worker process uses the same UCX
# configuration as child worker processes created by it.
initialize(
Expand All @@ -188,8 +166,6 @@ def del_pid_file():
enable_infiniband=enable_infiniband,
enable_nvlink=enable_nvlink,
enable_rdmacm=enable_rdmacm,
net_devices=net_devices,
cuda_device_index=0,
)

if jit_unspill is None:
Expand Down Expand Up @@ -232,7 +208,7 @@ def del_pid_file():
loop=loop,
resources=resources,
memory_limit=memory_limit,
interface=_get_interface(interface, host, i, net_devices),
interface=interface,
host=host,
preload=(list(preload) or []) + ["dask_cuda.initialize"],
preload_argv=(list(preload_argv) or []) + ["--create-cuda-context"],
Expand All @@ -248,7 +224,9 @@ def del_pid_file():
rmm_managed_memory,
rmm_async,
rmm_log_directory,
rmm_track_allocations,
),
PreImport(pre_import),
},
name=name if nprocs == 1 or name is None else str(name) + "-" + str(i),
local_directory=local_directory,
Expand All @@ -258,8 +236,6 @@ def del_pid_file():
enable_infiniband=enable_infiniband,
enable_nvlink=enable_nvlink,
enable_rdmacm=enable_rdmacm,
net_devices=net_devices,
cuda_device_index=i,
)
},
data=data(nvml_device_index(i, cuda_visible_devices(i))),
Expand Down
Loading

0 comments on commit 29a8e29

Please sign in to comment.