Performance regression in cuDF merge benchmark #935
RAPIDS 21.12 and 22.02 perform better than 21.06; the regression first appeared in 22.04, see the results below.

[Benchmark results: RAPIDS 21.06 cuDF benchmark - 10 iterations]
[Benchmark results: RAPIDS 21.12 cuDF benchmark - 10 iterations]
[Benchmark results: RAPIDS 22.02 cuDF benchmark - 10 iterations]
[Benchmark results: RAPIDS 22.04 cuDF benchmark - 10 iterations]
The reason for this behavior is compression. Dask 2022.3.0 (RAPIDS 22.04) depends on lz4, whereas Dask 2022.1.0 (RAPIDS 22.02) doesn't, and Distributed enables compression by default whenever lz4 is installed. The results below compare RAPIDS 22.04 with compression disabled and with lz4.

[Benchmark results: RAPIDS 22.04 (no compression)]
[Benchmark results: RAPIDS 22.04 (lz4)]
@quasiben @jakirkham do you have any ideas or suggestions on the best way to handle this? It feels to me like Dask-CUDA/Dask-cuDF should disable compression by default or find a suitable alternative to the CPU compression algorithms that are available by default.
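A quick way to confirm this behavior locally is the following minimal diagnostic sketch, assuming the standard `distributed.comm.compression` config key (this snippet is not from the original report):

```python
# Assumed diagnostic sketch: check whether lz4 is importable and what
# compression setting Distributed will pick up. With the default "auto",
# Distributed uses lz4 whenever it is installed, so simply pulling in the lz4
# dependency silently enables wire compression.
import importlib.util

import dask
import distributed  # noqa: F401  -- importing registers the distributed.* config defaults

print("lz4 installed:", importlib.util.find_spec("lz4") is not None)
print("configured compression:", dask.config.get("distributed.comm.compression"))
```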
Good catch @pentschev!
I agree, we should disable compression by default for now.
That is a good idea @madsbk. Is this something we plan on adding to Distributed? It would be good to do that and do some testing/profiling.
For GPU data, compression is worse rather than better because it provokes device-to-host transfers when they are unnecessary. This is a short-term fix for rapidsai#935, in lieu of hooking up GPU-based compression algorithms.
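For reference, a minimal sketch of the workaround from user code, assuming the standard `distributed.comm.compression` setting; the actual change in #957 may be wired up differently inside Dask-CUDA:

```python
# Hedged sketch: disable host-side wire compression before any workers start,
# so GPU buffers are no longer copied to host just to be run through lz4.
import dask
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

dask.config.set({"distributed.comm.compression": None})

if __name__ == "__main__":
    cluster = LocalCUDACluster()  # one worker per visible GPU
    client = Client(cluster)
    # ... run the workload; messages are now sent uncompressed ...
```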
A short-term fix disabling compression is in #957.
Running the cuDF merge benchmark with RAPIDS 22.06 results in the following:

[Benchmark results: RAPIDS 22.06 cuDF benchmark]

If we roll back one year, to RAPIDS 21.06, performance was substantially superior:

[Benchmark results: RAPIDS 21.06 cuDF benchmark]

It isn't clear where this regression comes from, but potential candidates are Distributed, cuDF, or Dask-CUDA itself.
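For context, here is a rough sketch of the kind of workload the merge benchmark exercises. This is an illustrative reconstruction (column names, row counts, and partition counts are made up), not the actual benchmark script shipped with Dask-CUDA:

```python
# Illustrative sketch of a distributed cuDF merge: the all-to-all shuffle it
# triggers is where comm compression (and the regression above) shows up.
import cudf
import dask_cudf
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

if __name__ == "__main__":
    cluster = LocalCUDACluster()  # one worker per visible GPU
    client = Client(cluster)

    # Two random key/value frames spread across the workers.
    left = dask_cudf.from_cudf(
        cudf.datasets.randomdata(nrows=1_000_000, dtypes={"key": int, "x": float}),
        npartitions=8,
    )
    right = dask_cudf.from_cudf(
        cudf.datasets.randomdata(nrows=1_000_000, dtypes={"key": int, "y": float}),
        npartitions=8,
    )

    merged = left.merge(right, on="key")
    print("merged rows:", len(merged))  # forces the shuffle and merge to run
```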