
Divide metric into train_metric and valid_metric #1912

Closed
wants to merge 21 commits

Conversation

@henry0312 commented Dec 17, 2018

Close #1911

@henry0312 commented Dec 17, 2018

sample1.py
(eval_set contains only the valid set)

import numpy
import lightgbm
from lightgbm.sklearn import LGBMRegressor, LGBMModel, LGBMRanker
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler


X, y = make_regression(n_samples=100, n_features=10, random_state=42)
scaler = MinMaxScaler((0, 20))
y = scaler.fit_transform(y.reshape(-1, 1))
y = y.reshape(-1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
y_test = y_test.astype(int)  # label should be int type
group_train = [10 for _ in range(7)]
group_test = [10 for _ in range(3)]

# create dataset for lightgbm
lgb_train = lightgbm.Dataset(X_train, y_train, group=group_train)
lgb_eval = lightgbm.Dataset(X_test, y_test, group=group_test)

# specify your configurations as a dict
params = {
    'boosting_type': 'gbdt',
    'objective': 'regression',
    'valid_metric': 'ndcg',
    'eval_at': [1, 3, 5],
    'label_gain': list(range(0, numpy.max(y_test) + 1)),
    'num_leaves': 15,
    'learning_rate': 0.01,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'seed': 42,
}

gbm = lightgbm.train(params,
                     lgb_train,
                     num_boost_round=1000,
                     valid_sets=[lgb_eval],
                     early_stopping_rounds=10)

sample2.py
(eval_set contains both the train and valid sets)

import numpy
import lightgbm
from lightgbm.sklearn import LGBMRegressor, LGBMModel, LGBMRanker
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler


X, y = make_regression(n_samples=100, n_features=10, random_state=42)
scaler = MinMaxScaler((0, 20))
y = scaler.fit_transform(y.reshape(-1, 1))
y = y.reshape(-1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
y_test = y_test.astype(int)  # label should be int type
group_train = [10 for _ in range(7)]
group_test = [10 for _ in range(3)]

# create dataset for lightgbm
lgb_train = lightgbm.Dataset(X_train, y_train, group=group_train)
lgb_eval = lightgbm.Dataset(X_test, y_test, group=group_test)

# specify your configurations as a dict
params = {
    'boosting_type': 'gbdt',
    'objective': 'regression',
    'train_metric': ['l2', 'l1'],
    'valid_metric': ['ndcg'],
    'eval_at': [3, 5],
    'label_gain': list(range(0, numpy.max(y_test) + 1)),
    'num_leaves': 15,
    'learning_rate': 0.01,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'seed': 42,
}

gbm = lightgbm.train(params,
                     lgb_train,
                     num_boost_round=1000,
                     valid_sets=[lgb_train, lgb_eval],
                     early_stopping_rounds=10)

Both sample scripts work.

@guolinke

@henry0312 Can this be backward compatible?

@henry0312

@guolinke No, this change will break backward compatibility, so I think we will need to release v3.x.

@StrikerRUS

@henry0312 Can you please provide a short description of the changes on the cpp side, to help indicate other places where something should be updated?

@henry0312

@StrikerRUS Sure, please wait for a day.

@guolinke

@henry0312 I think it is better to keep this backward compatible, as this change is for some rare cases.

  1. metric and its aliases will be set to both train_metric and valid_metric
  2. train_metric applies only to the training dataset
  3. valid_metric applies only to the validation dataset

@henry0312

  1. metric and its aliases will be set to both train_metric and valid_metric

Is it possible?
I don't know how to implement it...

@guolinke

@henry0312
I think this could be done by some conditions, like:

if `metric` in params:
  set metric to training data
  set metric to validation data
else:
  if `train_metric` in params:
    set metric to training data
  if `valid_metric` in params:
    set metric to validation data
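
A minimal Python sketch of that priority rule (only an illustration of the idea above, not the actual implementation; resolve_metrics and the returned dict are hypothetical):

def resolve_metrics(params):
    # `metric` (or any of its aliases, assumed already normalized to `metric` here)
    # wins and applies to both datasets, which keeps backward compatibility
    if 'metric' in params:
        return {'train': params['metric'], 'valid': params['metric']}
    resolved = {}
    if 'train_metric' in params:
        resolved['train'] = params['train_metric']
    if 'valid_metric' in params:
        resolved['valid'] = params['valid_metric']
    return resolved

For example, resolve_metrics({'metric': 'l2', 'valid_metric': 'ndcg'}) reports 'l2' for both datasets, while resolve_metrics({'valid_metric': 'ndcg'}) reports a metric only for the validation data.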

@henry0312 commented Dec 18, 2018 via email

`python helpers/parameter_generator.py`
@henry0312

http://www.csiss.org/SPACE/workshops/2004/SAC/files/fisher.pdf (Fisher (1958)) is not found.
Perhaps www.csiss.org is down now...
I don't know which work Fisher, 1958 refers to.

henry0312 changed the title from "Divide metric into train_metric and valid_metric" to "[WIP] Divide metric into train_metric and valid_metric" on Dec 18, 2018
@StrikerRUS

@henry0312 Fisher is OK now. I've re-run the test.

@henry0312

@StrikerRUS

Can you please provide a short description of the changes on the cpp side, to help indicate other places where something should be updated?

I divided the metric parameter into train_metric and valid_metric.
train_metric is used for evaluation on the train dataset and valid_metric is used for evaluation on the valid dataset.
(Currently, the metric parameter is used for both the train and valid datasets.)

We can pass different metrics to train_metric and valid_metric individually
(e.g. train_metric=l2 and valid_metric=ndcg)

cf. #1912 (comment)
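
For reference, a minimal params dict using the split proposed in this PR (train_metric and valid_metric are the new parameter names introduced here, not parameters of released LightGBM):

params = {
    'objective': 'regression',
    'train_metric': 'l2',    # evaluated on the training Dataset
    'valid_metric': 'ndcg',  # evaluated on the validation Dataset(s)
    'eval_at': [1, 3, 5],    # positions at which ndcg is evaluated
}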

@henry0312

I'm working on backward compatibility. It will take a while.

@StrikerRUS

@henry0312 Thanks a lot for the explanation! Is the default behavior (picking the metric corresponding to the objective in case no metric is passed in params) preserved for both metrics?

henry0312 changed the title from "[WIP] Divide metric into train_metric and valid_metric" to "Divide metric into train_metric and valid_metric" on Dec 20, 2018
@henry0312

@guolinke @StrikerRUS ready for review :)

@henry0312

@StrikerRUS

Is the default behavior (picking the metric corresponding to the objective in case no metric is passed in params) preserved for both metrics?

Yes, but the default value is set on the C++ side.

https://github.com/Microsoft/LightGBM/pull/1912/files#diff-2060c568af2dc566b6e950554a179e00R470

@StrikerRUS

@henry0312 Great!

As @guolinke has already mentioned, this use case is rather uncommon and therefore we should try to preserve the current behavior for ordinary users (backward compatibility). Due to this, I suppose that metric (and its aliases) should have priority over train_metric and valid_metric. Also, I believe that we don't need new arguments in the Python code; they will not be used often and will simply confuse users, because one can achieve the same result by using params['train_metric']/params['valid_metric'].
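
A rough sketch of what "the same result via params" could look like (assuming this PR's train_metric/valid_metric keys, which are part of the proposal and not of released LightGBM):

from lightgbm import LGBMRegressor

params = {'num_leaves': 15, 'learning_rate': 0.01}
params['train_metric'] = 'l2'
params['valid_metric'] = 'ndcg'
# extra keyword arguments are forwarded to the Booster's params,
# so no new fit() arguments would be needed
model = LGBMRegressor(**params)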

@henry0312 commented Dec 20, 2018

@StrikerRUS

and therefore we should try to preserve the current behavior for ordinary users (backward compatibility).

I think this change has backward compatibility because the default value for metric is set in C++ in the same way as the current code.
In other words, regression will be set if using LGBMRegressor, binary_error will be set if using LGBMClassifier, and ndcg will be set if using LGBMRanker.

See:

@@ -460,24 +467,12 @@ def fit(self, X, y,
            feval = _eval_function_wrapper(eval_metric)
        else:
            feval = None
        # register default metric for consistency with callable eval_metric case
Collaborator

I think this change has backward compatibility because the default value for metric is set in C++ in the same way as the current code.

Yeah, but I'm talking about the Python side (we had many issues about inconsistent behavior on the Python side and this code solved them). So, this code should be left as is to prevent future changes in the current behavior. Please see our old conversation (setting the default metric only on the cpp side is not enough): #1589 (comment)

@StrikerRUS commented Jan 15, 2019

@henry0312 Please fix this (leave old code) to get this PR merged.
I think that, in case we don't introduce new arguments (please see #1912 (review)), it's enough just to add something like this after the old code:

if params.get('train_metric') or params.get('valid_metric'):
    params['metric'] = 'None'

to support this new behavior

if metric is set, train_metric/valid_metric will be overwritten

I believe that this is the most painless and least odd-behavior-introducing way to support the new train/valid metric behavior 😃

Collaborator

@henry0312 @guolinke I've decided to dive into the "metrics hell" again and started to write a big test which reflects the current behavior of metrics handling: 8265469 (metrics_test branch). The next step is sklearn - the hardest part. After this, I'll create a PR.

Collaborator

PR has been created: #1954.

@henry0312 I hope it will help you abandon the idea of creating more arguments for metrics 😄
I like the idea of separate metrics a lot, but seriously, the corresponding keys in params are quite enough.

@@ -330,7 +330,8 @@ def fit(self, X, y,
             sample_weight=None, init_score=None, group=None,
             eval_set=None, eval_names=None, eval_sample_weight=None,
             eval_class_weight=None, eval_init_score=None, eval_group=None,
-            eval_metric=None, early_stopping_rounds=None, verbose=True,
+            eval_metric=None, eval_train_metric=None, eval_valid_metric=None,
@StrikerRUS commented Dec 20, 2018

I don't think that these new args are really needed. One can achieve the needed result in these rare cases with params.

@henry0312

I don't think that these new args are really needed.
One can achieve the needed result in these rare cases with params.

I don't think these are rare cases, because CatBoost, which is one of the most famous GBDT libraries and was accepted at NIPS 2018, supports setting individual metrics for training and validation.
This PR (change) allows us to set an evaluation metric different from the objective function, which is useful and needed, and that is why CatBoost implemented it.

cf. eval_metric in
https://tech.yandex.com/catboost/doc/dg/concepts/python-reference_parameters-list-docpage/#python-reference_parameters-list.

However, as you pointed out, eval_metric, eval_train_metric and eval_valid_metric are confusing for many users...

Collaborator

@guolinke What do you think? My position is that the corresponding keys in params are enough, because new arguments introduce a huge number of possible combinations which is very hard to support.

I remember we were getting rid of some arguments in favor of their aliases in params some time ago (the max_bin argument is the first thing that comes to my mind). So, new args are against this policy.

Collaborator

Indeed, I think having 3 similar parameters is confusing. BTW, eval_metric can be a function, while eval_train_metric and eval_valid_metric can only be strings; this inconsistency will cause further confusion...
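
To illustrate the inconsistency: eval_metric in the sklearn wrapper already accepts a callable, roughly as sketched below (signature as documented for the sklearn API; the commented fit() call assumes the objects from the earlier sample scripts), whereas the proposed eval_train_metric/eval_valid_metric would accept only strings:

import numpy as np

def rmse_metric(y_true, y_pred):
    # custom metric: returns (eval_name, eval_result, is_higher_better)
    return 'rmse', float(np.sqrt(np.mean((y_true - y_pred) ** 2))), False

# gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)], eval_metric=rmse_metric)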

@henry0312

@guolinke It has been a while since I sent this PR, but I still need this feature.
Could you check the changes, especially in C++?

@guolinke

@henry0312 sorry, I have been busy recently. I will check it now.

@@ -1553,7 +1552,6 @@ def __init__(self, params=None, train_set=None, model_file=None, silent=False):
        # buffer for inner predict
        self.__inner_predict_buffer = [None]
        self.__is_predicted_cur_iter = [False]
-       self.__get_eval_info()
Collaborator

Is the reload deleted?
I think this will somehow affect the performance.

@@ -426,19 +426,20 @@ LGBM_SE LGBM_BoosterGetEvalNames_R(LGBM_SE handle,
    LGBM_SE buf_len,
    LGBM_SE actual_len,
    LGBM_SE eval_names,
-   LGBM_SE call_state) {
+   LGBM_SE call_state,
+   LGBM_SE data_idx) {
Collaborator

this will break the R package; you should update the R files accordingly.
BTW, call_state should be the last argument.

@@ -82,7 +82,8 @@ def objective_ls(y_true, y_pred):
     X, y = load_boston(True)
     X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
     gbm = lgb.LGBMRegressor(n_estimators=50, silent=True, objective=objective_ls)
-    gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)], early_stopping_rounds=5, verbose=False)
+    gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)], eval_metric='regression',
+            early_stopping_rounds=5, verbose=False)
Collaborator

can we test these with only metric, so that we can check backward compatibility?

  }
  return ret;
}

-int GetEvalNames(char** out_strs) const {
+int GetEvalNames(int data_idx, char** out_strs) const {
Collaborator

do we have a per-data metric?

|| (!objective_type_multiclass && metric_type_multiclass)) {
Log::Fatal("Multiclass objective and metrics don't match");
}
}
Collaborator

we can pack the checking into a function and avoid the repeated code.

@henry0312

@guolinke Thank you for your quick review even though you are busy.
I'll also check and reply as soon as possible.

BTW, I put some ideas on #1934.

@StrikerRUS left a comment

This is protection from an accidental merge before #1912 (review) is resolved, because it breaks the consistency between the cases with and without a custom callable metric.

@StrikerRUS

@henry0312 Hi!
Can you please tell what is the fate of this PR?

@jameslamb

@henry0312 Hi!
Can you please tell what is the fate of this PR?

@henry0312 what is the status of this PR?

@henry0312

I think this feature will be needed, but I have to rewrite it, since it has taken too long (I'm sorry) and there are conflicts.
For now, I will close this PR.

henry0312 closed this Nov 17, 2019
@StrikerRUS

Ah, you closed it! I wanted to congratulate you on the one-year anniversary next month!.. 😃

lock bot locked as resolved and limited conversation to collaborators Mar 10, 2020
Successfully merging this pull request may close these issues.

objective = regression and metric = ndcg