[python-package] Fix inconsistency in predict() output shape for 1-tree models
#6753
@@ -15,7 +15,7 @@
 import psutil
 import pytest
 from scipy.sparse import csr_matrix, isspmatrix_csc, isspmatrix_csr
-from sklearn.datasets import load_svmlight_file, make_blobs, make_multilabel_classification
+from sklearn.datasets import load_svmlight_file, make_blobs, make_multilabel_classification, make_regression
 from sklearn.metrics import average_precision_score, log_loss, mean_absolute_error, mean_squared_error, roc_auc_score
 from sklearn.model_selection import GroupKFold, TimeSeriesSplit, train_test_split
@@ -2307,6 +2307,30 @@ def test_refit():
     assert err_pred > new_err_pred


+def test_refit_with_one_tree():
+    X, y = load_breast_cancer(return_X_y=True)
+    lgb_train = lgb.Dataset(X, label=y)
+    params = {"objective": "binary", "verbosity": -1}
+    model = lgb.train(params, lgb_train, num_boost_round=1)
+    model_refit = model.refit(X, y)
+    assert isinstance(model_refit, lgb.Booster)
+
+    X, y = make_regression(n_samples=10_000, n_features=10)
+    lgb_train = lgb.Dataset(X, label=y)
+    params = {"objective": "regression", "verbosity": -1}
+    model = lgb.train(params, lgb_train, num_boost_round=1)
+    model_refit = model.refit(X, y)
+    assert isinstance(model_refit, lgb.Booster)
+
+
+def test_pred_leaf_output_shape():
+    X, y = make_regression(n_samples=10_000, n_features=10)
+    dtrain = lgb.Dataset(X, label=y)
+    params = {"objective": "regression", "verbosity": -1}
+    assert lgb.train(params, dtrain, num_boost_round=1).predict(X, pred_leaf=True).shape == (10_000, 1)
+    assert lgb.train(params, dtrain, num_boost_round=2).predict(X, pred_leaf=True).shape == (10_000, 2)
Thanks! Please put this test down here by other
LightGBM/tests/python_package_test/test_engine.py, line 3827 in b33a12e
And let's please:

Thanks! I added multiple shape check tests for
+
+
 def test_refit_dataset_params(rng):
     # check refit accepts dataset_params
     X, y = load_breast_cancer(return_X_y=True)
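For callers that need to work both with and without this change, one option is to normalise the `pred_leaf=True` result on the caller side. The following is only a sketch, assuming a 1-tree model may return a flat array of length `n_rows`; `as_leaf_matrix` is a hypothetical helper, not part of LightGBM and not the fix made in this PR.

```python
import numpy as np


def as_leaf_matrix(leaf_pred, n_rows):
    """Return leaf indices as a 2-D array of shape (n_rows, n_trees).

    Hypothetical caller-side helper: accepts either a flat array of
    length n_rows (assumed 1-tree case) or an (n_rows, n_trees) array.
    """
    arr = np.asarray(leaf_pred)
    if n_rows <= 0 or arr.size % n_rows != 0:
        raise ValueError(f"cannot arrange {arr.size} values into {n_rows} rows")
    # reshape leaves an already 2-D (n_rows, n_trees) array unchanged
    return arr.reshape(n_rows, -1)


# Stand-in arrays with the two shapes discussed in the test above.
flat = np.zeros(10_000, dtype=np.int32)
two_trees = np.zeros((10_000, 2), dtype=np.int32)
print(as_leaf_matrix(flat, 10_000).shape)       # (10000, 1)
print(as_leaf_matrix(two_trees, 10_000).shape)  # (10000, 2)
```

With a helper like this, downstream code can index `[:, tree_idx]` without special-casing single-tree models.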
If this test for regression is going to repeat all of the same code (totally fine, repetition in test code can be helpful!), then let's please just make it a separate test case.
That way, the test could be targeted individually like
pytest './tests/python_package_test/test_engine.py::test_refit_with_one_tree_binary_classification'
Thanks, I added a multiclass example and split the test into separate test cases in 03e41c6.