[dask][docs] initial setup for Dask docs #3822
After LightGBM 3.2.0 is released, I'd like to make a pull request in
What is the value of
thank you for this! I left some small suggestions, but I agree with these changes.
python-package/lightgbm/dask.py
Outdated
@@ -384,6 +385,9 @@ def _predict(model, data, raw_score=False, pred_proba=False, pred_leaf=False, pr

class _LGBMModel:
    def __init__(self):
        if not all((DASK_INSTALLED, PANDAS_INSTALLED, SKLEARN_INSTALLED)):
            raise LightGBMError('Dask, Pandas and Scikit-learn are required for this module')
-            raise LightGBMError('Dask, Pandas and Scikit-learn are required for this module')
+            raise LightGBMError('dask, pandas and scikit-learn are required for lightgbm.dask')
Instead of "this module", could you use the specific name? I think that makes the log message a little more useful standalone. It can be helpful for cases where people don't have direct access to the stack trace, which is required to understand what "this module" refers to.
For example, user code or other frameworks might write things like this:
    try:
        dask_reg = DaskLGBMClassifier()
    except LightGBMError as err:
        log.fatal(err)
        raise SomeOtherException("LightGBM training failed")
I also think packages should be referenced by their exact package names, not capitalized names.
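To illustrate the point about standalone log messages, here is a minimal sketch. `LightGBMError` and the `*_INSTALLED` flags are local stand-ins defined in the snippet, not imported from lightgbm, and `build_model` and `try_build` are hypothetical helpers. The sketch only shows that a caller who logs the exception text alone still learns which module failed.

```python
import logging


class LightGBMError(Exception):
    """Local stand-in for lightgbm's exception class (assumption for this sketch)."""


# Stand-in flags; here Dask is pretended to be missing.
DASK_INSTALLED = False
PANDAS_INSTALLED = True
SKLEARN_INSTALLED = True


def build_model():
    """Hypothetical helper mirroring the dependency check from the diff."""
    if not all((DASK_INSTALLED, PANDAS_INSTALLED, SKLEARN_INSTALLED)):
        raise LightGBMError('dask, pandas and scikit-learn are required for lightgbm.dask')
    return object()


def try_build(log):
    """Log only the exception message, as downstream code might, and return it."""
    try:
        build_model()
    except LightGBMError as err:
        log.error(err)
        return str(err)
    return None


logged_message = try_build(logging.getLogger("example"))
```

With the suggested wording, the logged line names `lightgbm.dask` directly, so no stack trace is needed to find the source of the failure.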
Thank you! Addressed in acac78f.
@@ -344,7 +344,7 @@ def run(self):
     extras_require={
         'dask': [
             'dask[array]>=2.0.0',
-            'dask[dataframe]>=2.0.0'
+            'dask[dataframe]>=2.0.0',
oh wow, thank you!
TBH, I first noticed that on the LGTM site:
https://lgtm.com/projects/g/microsoft/LightGBM?mode=tree
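The comma added in the `extras_require` diff matters because Python joins adjacent string literals into one string. The diff shown here is cut off after the changed line, so the sketch below assumes another string entry followed it; `'another-package'` is a placeholder, not the real next entry.

```python
# Without the comma, the two adjacent literals are silently joined into one string.
missing_comma = [
    'dask[array]>=2.0.0',
    'dask[dataframe]>=2.0.0'
    'another-package',  # placeholder entry (assumption)
]

# With the comma, each requirement stays a separate list element.
with_comma = [
    'dask[array]>=2.0.0',
    'dask[dataframe]>=2.0.0',
    'another-package',  # placeholder entry (assumption)
]

print(len(missing_comma), repr(missing_comma[1]))
print(len(with_comma), repr(with_comma[1]))
```

Under that assumption, the first list has two elements and its second element is a merged, invalid requirement string, which is the kind of defect a static analyzer can flag.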
Great!
I believe this class is needed for extending scikit-learn features by supporting more objectives. For example, right now LightGBM supports the cross-entropy application. LGBMClassifier cannot be used with this objective because scikit-learn checks the targets (see LightGBM/python-package/lightgbm/sklearn.py, line 818 in ac706e1).
One can work around this with LGBMRegressor and the cross_entropy objective, but I don't think this is semantically correct.
Some other checks can be added in So in general
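As a rough illustration of the mismatch described above: LightGBM's parameter documentation describes cross_entropy labels as anything in the interval [0, 1], and fractional targets of that kind are what a classifier-style target check rejects. The check below is a simplified stand-in written for this sketch, not scikit-learn's actual implementation, and both function names are hypothetical.

```python
import math


def cross_entropy(label, prob):
    """Cross-entropy loss for one example; label may be any value in [0, 1]."""
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def looks_like_class_labels(targets):
    """Hypothetical stand-in for a classifier-style check: whole numbers only."""
    return all(float(t).is_integer() for t in targets)


soft_targets = [0.1, 0.7, 0.35]  # probabilistic labels, valid for cross-entropy
hard_targets = [0, 1, 1]         # ordinary class labels

# The loss is well defined for a fractional label such as 0.7.
loss = cross_entropy(0.7, 0.6)
```

The loss is well defined for `soft_targets`, yet they fail the class-label check, which is why a regressor-style interface ends up being used as the workaround.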
these changes look good, thanks very much!
Ok, thanks for the explanation! I've created #3845 for the feature request.
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Towards #3814.
Maybe we can ask to archive the old dask-lightgbm repo?
Live demo: https://lightgbm.readthedocs.io/en/dask_docs/Python-API.html#dask-api.
Why don't we have the most general class DaskLGBMModel?