Apply suggestions from code review
rjzamora authored Sep 20, 2024
1 parent 40a638e commit d082cac
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions docs/dask_cudf/source/best_practices.rst
@@ -187,16 +187,16 @@ Reading Data
Tune the partition size
~~~~~~~~~~~~~~~~~~~~~~~

- The ideal partition size is usually between 1/16 and 1/8 the memory
+ The ideal partition size is usually between 1/32 and 1/8 the memory
capacity of a single GPU. Increasing the partition size will typically
reduce the number of tasks in your workflow and improve the GPU utilization
for each task. However, if the partitions are too large, the risk of OOM
errors can become significant.

.. note::
- As a general rule of thumb, aim for 1/16 in shuffle-intensive workflows
- (e.g. large-scale sorting and joining), and 1/8 otherwise. For pathologically
- skewed data distributions, it may be necessary to target 1/32 or smaller.
+ As a general rule of thumb, start with 1/32-1/16 in shuffle-intensive workflows
+ (e.g. large-scale sorting and joining), and 1/16-1/8 otherwise. For pathologically
+ skewed data distributions, it may be necessary to target 1/64 or smaller.
This rule of thumb comes from anecdotal optimization and OOM-debugging
experience. Since every workflow is different, choosing the best partition
size is both an art and a science.
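
To make the updated rule of thumb concrete, the arithmetic can be sketched in plain Python. This is only an illustration of the fractions quoted in the new text, not part of dask_cudf: the helper name `suggest_partition_bytes` and its `shuffle_intensive` flag are hypothetical, and the 80 GiB figure is an assumed GPU memory capacity.

```python
from fractions import Fraction

# Starting ranges from the revised note, as fractions of one GPU's memory.
SHUFFLE_RANGE = (Fraction(1, 32), Fraction(1, 16))  # sorting, joining
DEFAULT_RANGE = (Fraction(1, 16), Fraction(1, 8))   # everything else
# For pathologically skewed data, the note suggests 1/64 or smaller;
# that case is left to manual tuning and is not modelled here.


def suggest_partition_bytes(gpu_memory_bytes, shuffle_intensive=False):
    """Return a (low, high) starting range for the partition size, in bytes.

    Hypothetical helper: it only encodes the rule of thumb and is no
    substitute for profiling a real workflow.
    """
    if gpu_memory_bytes <= 0:
        raise ValueError("gpu_memory_bytes must be positive")
    low, high = SHUFFLE_RANGE if shuffle_intensive else DEFAULT_RANGE
    return int(gpu_memory_bytes * low), int(gpu_memory_bytes * high)


GIB = 2**30
gpu_memory = 80 * GIB  # assumed capacity of a single GPU

shuffle_low, shuffle_high = suggest_partition_bytes(gpu_memory, shuffle_intensive=True)
default_low, default_high = suggest_partition_bytes(gpu_memory)

print(f"shuffle-intensive: {shuffle_low / GIB} to {shuffle_high / GIB} GiB")
print(f"otherwise: {default_low / GIB} to {default_high / GIB} GiB")
```

Under these assumptions, an 80 GiB GPU gives a starting range of 2.5 to 5 GiB for shuffle-intensive workflows and 5 to 10 GiB otherwise. How the chosen size is passed to a reader depends on the dask_cudf version in use, so the relevant reader documentation is the place to confirm the option name.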