Divide metric into train_metric and valid_metric #1912
Conversation
sample1.py
import numpy
import lightgbm
from lightgbm.sklearn import LGBMRegressor, LGBMModel, LGBMRanker
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
X, y = make_regression(n_samples=100, n_features=10, random_state=42)
scaler = MinMaxScaler((0, 20))
y = scaler.fit_transform(y.reshape(-1, 1))
y = y.reshape(-1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
y_test = y_test.astype(int) # label should be int type
group_train = [10 for _ in range(7)]
group_test = [10 for _ in range(3)]
# create dataset for lightgbm
lgb_train = lightgbm.Dataset(X_train, y_train, group=group_train)
lgb_eval = lightgbm.Dataset(X_test, y_test, group=group_test)
# specify your configurations as a dict
params = {
    'boosting_type': 'gbdt',
    'objective': 'regression',
    'valid_metric': 'ndcg',
    'eval_at': [1, 3, 5],
    'label_gain': list(range(0, numpy.max(y_test) + 1)),
    'num_leaves': 15,
    'learning_rate': 0.01,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'seed': 42,
}
gbm = lightgbm.train(params,
                     lgb_train,
                     num_boost_round=1000,
                     valid_sets=[lgb_eval],
                     early_stopping_rounds=10)

sample2.py
import numpy
import lightgbm
from lightgbm.sklearn import LGBMRegressor, LGBMModel, LGBMRanker
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
X, y = make_regression(n_samples=100, n_features=10, random_state=42)
scaler = MinMaxScaler((0, 20))
y = scaler.fit_transform(y.reshape(-1, 1))
y = y.reshape(-1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
y_test = y_test.astype(int) # label should be int type
group_train = [10 for _ in range(7)]
group_test = [10 for _ in range(3)]
# create dataset for lightgbm
lgb_train = lightgbm.Dataset(X_train, y_train, group=group_train)
lgb_eval = lightgbm.Dataset(X_test, y_test, group=group_test)
# specify your configurations as a dict
params = {
    'boosting_type': 'gbdt',
    'objective': 'regression',
    'train_metric': ['l2', 'l1'],
    'valid_metric': ['ndcg'],
    'eval_at': [3, 5],
    'label_gain': list(range(0, numpy.max(y_test) + 1)),
    'num_leaves': 15,
    'learning_rate': 0.01,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'seed': 42,
}
gbm = lightgbm.train(params,
                     lgb_train,
                     num_boost_round=1000,
                     valid_sets=[lgb_train, lgb_eval],
                     early_stopping_rounds=10)

Both sample scripts work.
@henry0312 Can this be backward compatible?
@guolinke No, this change will break backward compatibility, so I think we will need to release v3.x.
@henry0312 Can you please provide a short description of the changes on the cpp side to help indicate other places where something should be updated?
@StrikerRUS Sure, please wait for a day.
@henry0312 I think it is better to keep backward compatibility, as this change is for some rare cases.
Is it possible?
@henry0312
Thank you.
I will check later.
Tue, 18 Dec 2018, 19:13 Guolin Ke <[email protected]>:
@henry0312 <https://github.com/henry0312>
I think this could be done by some conditions, like:
if `metric` in params:
    set metric to training data
    set metric to validation data
else:
    if `train_metric` in params:
        set metric to training data
    if `valid_metric` in params:
        set metric to validation data
--
Tsukasa OMOTO
[email protected]
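For concreteness, here is a minimal Python sketch of the dispatch described in the quoted pseudocode above. The `train_metric` and `valid_metric` keys are the hypothetical parameters proposed in this PR, not an existing LightGBM API, and resolve_metrics is an illustrative helper, not code from the PR:

def resolve_metrics(params):
    """Sketch only: return (train_metrics, valid_metrics) following the
    fallback order described in the comment above."""
    if 'metric' in params:
        # a plain `metric` applies to both training and validation data
        return params['metric'], params['metric']
    # otherwise use the hypothetical per-dataset keys, if present
    return params.get('train_metric'), params.get('valid_metric')

# usage example
params = {'objective': 'regression', 'train_metric': ['l2', 'l1'], 'valid_metric': ['ndcg']}
print(resolve_metrics(params))  # (['l2', 'l1'], ['ndcg'])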
`python helpers/parameter_generator.py`
http://www.csiss.org/SPACE/workshops/2004/SAC/files/fisher.pdf (Fisher (1958)) is not found.
@henry0312 Fisher is OK now. I've re-run the test.
I divided `metric` into `train_metric` and `valid_metric`. We can pass different metrics to the training and validation data. cf. #1912 (comment)
I'm working on backward compatibility. It will take a while.
@henry0312 Thanks a lot for the explanation! Is the default behavior (picking the metric corresponding to the objective when no metric is passed in params) preserved for both metrics?
@guolinke @StrikerRUS ready for review :)
Yes, but the default value is set on the c++ side. https://github.com/Microsoft/LightGBM/pull/1912/files#diff-2060c568af2dc566b6e950554a179e00R470
@henry0312 Great! As @guolinke has already mentioned, this use case is rather uncommon and therefore we should try to preserve the current behavior for ordinary users (backward compatibility). Due to this, I suppose that
I think this change has backward compatibility because the default value for metric is set in c++ in the same way as the current code. See:
@@ -460,24 +467,12 @@ def fit(self, X, y,
feval = _eval_function_wrapper(eval_metric)
else:
feval = None
# register default metric for consistency with callable eval_metric case |
I think this change has backward compatibility because the default value for metric is set in c++ in the same way as the current code.
Yeah, but I'm talking about the Python side (we had many issues about inconsistent behavior on the Python side, and this code solved them). So, this code should be left as is to prevent future changes in the current behavior. Please see our old conversation (setting the default metric only at the cpp side is not enough): #1589 (comment)
@henry0312 Please fix this (leave the old code) to get this PR merged.
I think that in case we don't introduce new arguments (please see #1912 (review)), it's enough just to add something like this after the old code:
if params.get('train_metric') or params.get('valid_metric'):
    params['metric'] = 'None'
to support this new behavior:
if `metric` is set, `train_metric`/`valid_metric` will be overwritten
I believe that this is the most painless and least odd-behavior-introducing way to support the new train/valid metric 😃
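For illustration, a self-contained sketch (my own, not code from the PR) of how the old default-metric registration and the proposed two-line addition could fit together; `train_metric`/`valid_metric` are the hypothetical new keys, and the string 'None' is what disables a metric in LightGBM params:

def apply_metric_params(params, default_metric):
    """Sketch only: existing default-metric registration plus the proposed addition."""
    params = dict(params)  # work on a copy
    # old behavior: register the default metric for consistency with the
    # callable eval_metric case
    params.setdefault('metric', default_metric)
    # proposed addition: if the hypothetical per-dataset keys are present,
    # disable the shared metric so they can take effect
    if params.get('train_metric') or params.get('valid_metric'):
        params['metric'] = 'None'
    return params

print(apply_metric_params({'valid_metric': ['ndcg']}, 'l2'))
# {'valid_metric': ['ndcg'], 'metric': 'None'}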
@henry0312 @guolinke I've decided to dive into the "metrics hell" again and started to write a big test which reflects the current behavior of metrics handling: 8265469 (metrics_test branch). The next step is sklearn - the hardest part. After this, I'll create a PR.
PR has been created: #1954.
@henry0312 I hope it will help you abandon the idea of creating more arguments for metrics 😄
I like the idea of separate metrics a lot, but seriously, corresponding keys in params are quite enough.
@@ -330,7 +330,8 @@ def fit(self, X, y,
sample_weight=None, init_score=None, group=None,
eval_set=None, eval_names=None, eval_sample_weight=None,
eval_class_weight=None, eval_init_score=None, eval_group=None,
eval_metric=None, early_stopping_rounds=None, verbose=True,
eval_metric=None, eval_train_metric=None, eval_valid_metric=None,
I don't think that these new args are really needed. One can achieve the needed result in their rare case with params.
I don't think that these new args are really needed.
One can achieve the needed result in their rare case with params.

I don't think these are rare cases, because CatBoost, which is one of the most famous GBDT libraries and was accepted at NIPS 2018, supports setting individual metrics for training and validation.
This PR (change) allows us to set an evaluation metric different from the objective function, which is useful and needed, and that is why CatBoost implemented it.
cf. eval_metric in https://tech.yandex.com/catboost/doc/dg/concepts/python-reference_parameters-list-docpage/#python-reference_parameters-list
However, as you pointed out, eval_metric, eval_train_metric and eval_valid_metric are confusing for many users...
@guolinke What do you think? My position is that corresponding keys in params are enough, because new arguments introduce a huge number of possible combinations which are very hard to support.
I remember we were getting rid of some arguments in favor of their aliases in params some time ago (the max_bin argument is the first thing that comes to my mind). So, new args are against this policy.
Indeed, I think having 3 similar parameters is confusing. BTW, eval_metric could be a function, while eval_train_metric and eval_valid_metric can only be strings; this inconsistency will cause further confusion...
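For reference, a minimal example of the kind of callable eval_metric the sklearn wrapper accepts (a function returning (eval_name, eval_result, is_higher_better)), which string-only eval_train_metric/eval_valid_metric arguments could not express; custom_mae is just an illustrative name, not part of the PR:

import numpy as np
import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

def custom_mae(y_true, y_pred):
    # custom eval metric: returns (name, value, is_higher_better)
    return 'custom_mae', float(np.mean(np.abs(y_true - y_pred))), False

X, y = make_regression(n_samples=200, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
gbm = lgb.LGBMRegressor(n_estimators=50)
gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)],
        eval_metric=custom_mae, verbose=False)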
@guolinke It has been a while since I sent this PR, but I still need this feature.
@henry0312 Sorry, I have been busy recently. I will check it now.
@@ -1553,7 +1552,6 @@ def __init__(self, params=None, train_set=None, model_file=None, silent=False):
# buffer for inner predict
self.__inner_predict_buffer = [None]
self.__is_predicted_cur_iter = [False]
self.__get_eval_info()
The reload is deleted?
I think this will somehow affect the performance.
@@ -426,19 +426,20 @@ LGBM_SE LGBM_BoosterGetEvalNames_R(LGBM_SE handle,
LGBM_SE buf_len,
LGBM_SE actual_len,
LGBM_SE eval_names,
LGBM_SE call_state) {
LGBM_SE call_state,
LGBM_SE data_idx) {
This will break the R package; you should update the R files accordingly.
BTW, call_state should be the last argument.
@@ -82,7 +82,8 @@ def objective_ls(y_true, y_pred):
X, y = load_boston(True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
gbm = lgb.LGBMRegressor(n_estimators=50, silent=True, objective=objective_ls)
gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)], early_stopping_rounds=5, verbose=False)
gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)], eval_metric='regression',
early_stopping_rounds=5, verbose=False)
Can we test these with only metric, so that we can check backward compatibility?
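A sketch of the kind of backward-compatibility check being asked for here, relying only on the metric key in params and no eval_metric argument (my own illustration, not a test from the PR; it assumes 'l1' stays the recorded metric name):

import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# only `metric` is passed through params; no eval_metric argument at all
gbm = lgb.LGBMRegressor(n_estimators=50, metric='l1')
gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)],
        early_stopping_rounds=5, verbose=False)
assert 'l1' in gbm.evals_result_['valid_0']  # old behavior should be preserved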
}
return ret;
}

int GetEvalNames(char** out_strs) const {
int GetEvalNames(int data_idx, char** out_strs) const {
do we have per-data metric ?
|| (!objective_type_multiclass && metric_type_multiclass)) {
Log::Fatal("Multiclass objective and metrics don't match");
}
}
We can pack the checking into a function and avoid the repeated code.
This is protection from an accidental merge before #1912 (review) is resolved, because it breaks the consistency between cases with and without a custom callable metric.
@henry0312 Hi!
@henry0312 What is the status of this PR?
I think this feature will be needed, but I have to rewrite it, since it has taken too long (I'm sorry) and there are conflicts.
Ah, you closed it! I wanted to congratulate you on the one-year anniversary next month!.. 😃
Close #1911