How to print best_iteration, best_score with lightgbm.DaskLGBMClassifier model? #4417
Comments
@darshanbint thanks for using LightGBM, and especially for your interest in the Dask interface! As I mentioned in #4409 (comment), please provide a reproducible example (including details on how you installed LightGBM and which version you installed) when you report an issue in this project. Without it, you are asking maintainers here to guess what your code looks like, which increases the time it takes to come to a resolution and draws maintainer attention away from other work on the project.
As documented at https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.DaskLGBMClassifier.html#lightgbm-dasklgbmclassifier, DaskLGBMClassifier exposes best_iteration_ and best_score_ attributes. Just like in the non-Dask interface, best_score_ is only populated if you pass eval data to fit().
For example, try installing the Python package from the latest commit

cd python-package
python setup.py install

then running the code below

import dask.array as da
from dask.distributed import Client, LocalCluster, wait
from lightgbm.dask import DaskLGBMRegressor
from sklearn.datasets import make_regression
# set up Dask cluster
n_workers = 3
cluster = LocalCluster(n_workers=n_workers)
client = Client(cluster)
client.wait_for_workers(n_workers)
print(f"View the dashboard: {cluster.dashboard_link}")
# create training data and an additional eval set
def _make_dataset(n_samples):
    X, y = make_regression(n_samples=n_samples)
    dX = da.from_array(X, chunks=(1000, X.shape[1]))
    dy = da.from_array(y, chunks=1000)
    return dX, dy
# training data
dX, dy = _make_dataset(10_000)
# eval data
dX_e, dy_e = _make_dataset(2_000)
reg_params = {
    "client": client,
    "max_depth": 5,
    "objective": "regression_l1",
    "learning_rate": 0.1,
    "tree_learner": "data",
    "n_estimators": 100,
    "min_child_samples": 1
}

# model with eval sets
dask_reg = DaskLGBMRegressor(**reg_params)
dask_reg.fit(
    X=dX,
    y=dy,
    eval_set=[
        (dX, dy),
        (dX_e, dy_e)
    ]
)
print(dask_reg.best_score_)
# defaultdict(<class 'collections.OrderedDict'>, {'valid_0': OrderedDict([('l1', 23.068264930953998)]), 'valid_1': OrderedDict([('l1', 213.37329453022264)])})
# model without eval sets
dask_reg = DaskLGBMRegressor(**reg_params)
dask_reg.fit(
    X=dX,
    y=dy,
)
print(dask_reg.best_score_)
# defaultdict(<class 'collections.OrderedDict'>, {})
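For reference, best_score_ is a nested mapping of eval-set name to metric name to best value, so the individual numbers can be pulled out with plain dict iteration. A minimal sketch, using a hand-built stand-in with the same shape as the output printed above (no LightGBM required):

```python
from collections import OrderedDict, defaultdict

# hand-built stand-in shaped like dask_reg.best_score_ above:
# {eval set name -> {metric name -> best value}}
best_score = defaultdict(OrderedDict)
best_score["valid_0"]["l1"] = 23.068264930953998
best_score["valid_1"]["l1"] = 213.37329453022264

# print each metric value for each eval set
for eval_set_name, metrics in best_score.items():
    for metric_name, value in metrics.items():
        print(f"{eval_set_name} {metric_name}: {value:.4f}")
```

By default the eval sets are named valid_0, valid_1, ... in the order they appear in eval_set; here valid_0 is the training data and valid_1 is the held-out set.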
Cool, thank you for answering!
Description
With reference to https://github.com/microsoft/LightGBM/blob/master/examples/python-guide/dask/binary-classification.py, the best_iteration_ and best_score_ of the dask_model are None and {} respectively. Why are they None and {}? How can I print these attributes, like best_score_ and best_iteration_ on a non-Dask LightGBM model?