Support MPI in Dask #3831
Comments
Closed in favor of being in #2302. We decided to keep all feature requests in one place. Welcome to contribute this feature! Please re-open this issue (or post a comment if you are not a topic starter) if you are actively working on implementing this feature.
Thanks for writing this up! When I tried to use the existing I can provide a reproducible example with specific logs in the future, sorry that I don't have them readily available right now. Or anyone who's interested can follow https://github.com/jameslamb/lightgbm-dask-testing and change the installation instructions to build with MPI support, based on the links in this issue's description.
Just tried to run all Dask tests with a version of And it seems that a lot of things have been changed since the last update. Short summary:
Full testing logs:
I guess that the first 4 failures are quite easy to "fix" by slightly relaxing the assertion tolerance, e.g.
or
Also, sometimes All other failures (all parametrized variants of
Hi, I am looking for the
Summary
LightGBM's Dask interface currently supports only socket-based distributed training.
https://lightgbm.readthedocs.io/en/latest/Parallel-Learning-Guide.html#socket-version
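For context, socket-based training means every worker has to be told the full list of machines and the port it should listen on. Below is a minimal sketch of that bookkeeping, assuming LightGBM's documented `tree_learner`, `machines`, `num_machines` and `local_listen_port` parameters. The helper name `build_socket_params`, the example addresses and the starting port are hypothetical and are not taken from LightGBM's Dask module.

```python
from typing import Dict, List


def build_socket_params(
    worker_hosts: List[str], first_port: int = 12400
) -> List[Dict[str, object]]:
    """Return one LightGBM parameter dict per worker for socket-based training.

    Hypothetical helper for illustration only; it is not part of LightGBM.
    """
    # Assign each worker its own listen port, counting up from first_port.
    ports = [first_port + i for i in range(len(worker_hosts))]

    # LightGBM expects a comma-separated "host:port" list in `machines`.
    machines = ",".join(
        f"{host}:{port}" for host, port in zip(worker_hosts, ports)
    )

    # Every worker shares the same machine list but listens on its own port.
    return [
        {
            "tree_learner": "data",
            "machines": machines,
            "num_machines": len(worker_hosts),
            "local_listen_port": port,
        }
        for port in ports
    ]


if __name__ == "__main__":
    # Example addresses are placeholders.
    for params in build_socket_params(["10.0.0.1", "10.0.0.2"]):
        print(params)
```

With an MPI build, host and rank discovery would presumably be left to the MPI launcher, so this per-worker address and port handling is the part an MPI-backed Dask integration could skip.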
Motivation
Adding this feature would allow users to train more efficiently, because LightGBM has native support for MPI.
References
#3515 (comment)
LightGBM/tests/python_package_test/test_dask.py, line 36 in da44387
LightGBM/CMakeLists.txt, line 1 in da44387
https://lightgbm.readthedocs.io/en/latest/Parallel-Learning-Guide.html#mpi-version
http://mpi.dask.org/en/latest/
https://blog.dask.org/2019/01/31/dask-mpi-experiment