
[python-package][R-package] load parameters from model file (fixes #2613) #5424

Merged
merged 28 commits on Oct 11, 2022

Conversation

jmoralez
Collaborator

Given that #4802 seems to have stalled, this PR intends to supersede it, mainly to make #4323 easier by providing access to the features that would have to be turned into factors, and to unblock #5246.

Fixes #2613.

@jmoralez
Collaborator Author

@StrikerRUS @jameslamb I'd appreciate your take on this approach. If you agree I can make the required changes to the R-package as well. The main idea (sketched in Python after the list) is:

  1. Parse the contents of loaded_parameter_ into a JSON string.
  2. Parse the parameter types from the config.
  3. Load the parameters as strings in the wrappers and parse them to their corresponding types.
  4. Assign the result as the parameters of the created Booster when calling the load_from_string/load_from_file methods.
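
To make the flow concrete, here is a rough Python sketch of steps 2-4. The names (PARAM_TYPES, cast_loaded_params, raw_params) are illustrative placeholders, not the actual helpers added in this PR:

    # Hypothetical sketch: cast the parameter strings parsed out of the model
    # file into the types declared in the config.
    def parse_bool(value: str) -> bool:
        return value.lower() in ("1", "true")

    PARAM_TYPES = {
        "learning_rate": float,     # double in the config
        "num_leaves": int,          # int in the config
        "linear_tree": parse_bool,  # bool in the config
        "objective": str,           # string in the config
    }

    def cast_loaded_params(raw_params: dict) -> dict:
        """raw_params maps parameter names to the strings stored in the model file."""
        return {name: PARAM_TYPES.get(name, str)(value) for name, value in raw_params.items()}

    cast_loaded_params({"learning_rate": "0.05", "num_leaves": "31"})
    # -> {"learning_rate": 0.05, "num_leaves": 31}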

@jmoralez jmoralez changed the title from "Retrieve params" to "load parameters from model file" on Aug 16, 2022
@StrikerRUS
Collaborator

I'm afraid some floating point values will not survive float->string->float round trip...
Do we need something like the following?
dmlc/xgboost#5772

@jmoralez
Collaborator Author

I'm afraid some floating point values will not survive float->string->float round trip...

Yes, this will probably lose precision in some cases. We can address loading the parameters here and maybe open a feature request for making the round-tripped floating point values match the originals as closely as possible.
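
To illustrate the concern with a small Python example (not tied to LightGBM's actual serialization format), a float written with limited precision does not round-trip exactly:

    x = 1.0 / 3.0
    s = f"{x:.6g}"          # serialize with 6 significant digits -> "0.333333"
    assert float(s) != x    # the value read back differs from the original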

@StrikerRUS
Collaborator

Well, I don't think there can be a better solution than the proposed one.

@jmoralez jmoralez marked this pull request as ready for review August 29, 2022 17:04
@jmoralez
Collaborator Author

I don't think precision loss in the parameters is that big of a deal. There's already some loss in the thresholds and leaf values when writing the model file, so the parameters should be fine.

Collaborator

@jameslamb jameslamb left a comment


Left one quick comment about testing which might help in your work on this. I'll try to review more thoroughly later this week.

R-package/tests/testthat/test_lgb.Booster.R (outdated review thread)
@jmoralez
Collaborator Author

@shiyu1994 does cuda_exp set force_col_wise to false?

@StrikerRUS
Collaborator

@jmoralez

does cuda_exp set force_col_wise to false?

Yes. Some GPU parameters are hardcoded:

LightGBM/src/io/config.cpp, lines 336 to 355 in 83627ff:

    if (device_type == std::string("gpu") || device_type == std::string("cuda")) {
      // force col-wise for gpu, and cuda version
      force_col_wise = true;
      force_row_wise = false;
      if (deterministic) {
        Log::Warning("Although \"deterministic\" is set, the results ran by GPU may be non-deterministic.");
      }
    } else if (device_type == std::string("cuda_exp")) {
      // force row-wise for cuda_exp version
      force_col_wise = false;
      force_row_wise = true;
      if (deterministic) {
        Log::Warning("Although \"deterministic\" is set, the results ran by GPU may be non-deterministic.");
      }
    }
    // force gpu_use_dp for CUDA
    if (device_type == std::string("cuda") && !gpu_use_dp) {
      Log::Warning("CUDA currently requires double precision calculations.");
      gpu_use_dp = true;
    }

@shiyu1994
Collaborator

@jmoralez
Yes, cuda_exp forces force_col_wise=false and force_row_wise=true.

@jameslamb jameslamb self-requested a review August 31, 2022 19:38
@jameslamb
Collaborator

jameslamb commented Aug 31, 2022

/gha run r-valgrind

Workflow R valgrind tests has been triggered! 🚀
https://github.com/microsoft/LightGBM/actions/runs/2967193209

Status: success ✔️.

Collaborator

@jameslamb jameslamb left a comment


I was able to review this thoroughly today.

I love that you were able to accomplish this with only private changes to the R and Python packages, and one new public entrypoint in c_api. Nice work! That makes me confident that we could change this implementation in the future without breaking users.

I don't have any specific suggestions at the moment (tests look awesome!), but I asked two questions that I feel I need a better understanding of before I can approve this.

@@ -2764,6 +2764,7 @@ def __init__(
     ctypes.byref(out_num_class)))
     self.__num_class = out_num_class.value
     self.pandas_categorical = _load_pandas_categorical(file_name=model_file)
+    params = self._get_loaded_param()
Collaborator


Suggested change
-    params = self._get_loaded_param()
+    params_from_model_file = self._get_loaded_param()
+    params = {**params_from_model_file, **params}

In this code block, there are two sources of parameter values:

  • params keyword argument from the constructor
  • parameters parsed out of the model file

I think that in the R and Python packages, wherever those conflict, the value passed through the keyword argument params should be preferred. But I'd like to hear your and @StrikerRUS's opinions before you make any changes.
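
For reference, in a merge like the one suggested above, keys from the second dict win, so explicitly passed params would override whatever was read from the model file:

    params_from_model_file = {"learning_rate": 0.05, "num_leaves": 31}
    params = {"learning_rate": 0.1}  # passed to the constructor
    merged = {**params_from_model_file, **params}
    # merged == {"learning_rate": 0.1, "num_leaves": 31}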

Why we might want to support this behavior

That's consistent with how the R and Python packages treat the params argument generally (#4904 (comment)).

And it'd be useful for non-Dask distributed training (for example), where some of the parameters like machines might change between the training run that produced an initial model and another one that performs training continuation.

Why we might not want to support this behavior

I believe that the R and Python packages already ignore passed-in params if you're creating a Booster from a model file.

So maybe we want to continue to ignore them until a specific user report or reproducible example demonstrates that it's a problem or surprising.

Collaborator Author

@jmoralez jmoralez Sep 1, 2022


I prefer ignoring them, since they won't be used by the loaded booster and overriding them could cause confusion. We can raise a warning when both params and model_file are passed to the Booster constructor to make this more evident. When using the loaded booster as the init_model, users can override the parameters for the new iterations in the train function, so that should be enough.

I haven't done much incremental learning so I hadn't given much thought to this haha, please feel free to correct me.
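
As a rough illustration of that idea (not the exact code added later in this PR), the constructor could warn and then keep the values loaded from the file:

    import warnings

    # Hypothetical check inside Booster.__init__ for the model_file branch.
    if params:
        warnings.warn(
            "Ignoring params argument, using parameters from model file.",
            UserWarning,
        )
    params = self._get_loaded_param()  # parameters parsed from the model file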

Collaborator


We can raise a warning when passing both params and model_file to the Booster constructor to make this more evident.

I strongly believe we need a warning for users here.

And I think it's better to override the params loaded from the model string with the params passed in via kwargs.

Collaborator Author


Sure, I can add the warning. What would be an example use case for overriding the parameters from the model file when loading the booster?

Collaborator


Anything around continued training might result in loading a model from a file and then performing additional training with different parameters.

For examples:

  • my example in this thread, about wanting to change machines (since the IP addresses of machines used for distributed training might be different from when the model file was created)
  • wanting to use a different num_leaves or learning_rate when performing an update like "train a few more iterations on newly-arrived data" (see my explanation in this Stack Overflow answer)

Collaborator Author


But I'm struggling to understand when Booster.update would be called on a model loaded from a file. It requires a bit of setup, like defining the training set, and if you pass the training set to the constructor it takes a different path where it calls BoosterCreate with only the training set and the parameters.

The other case would be passing it to lgb.train as init_model and that already allows you to define the parameters for the new iterations.

Comment on lines +1226 to +1227
assert set_params == params
assert bst.params['categorical_feature'] == [1, 2]
Collaborator


These tests are checking that the params attribute on the Python Booster object matches what is in the file... but that being true doesn't necessarily guarantee that if you used this Booster for continued training it would still actually use those parameters on the C++ side, right?

I'm bringing this up because I'm struggling to understand the relationship between GBDT::loaded_parameter_

loaded_parameter_ = ss.str();

and GBDT::config_

/*! \brief Config of gbdt */
std::unique_ptr<Config> config_;

In #2613 (comment), I'd recommended using GBDT::config_ as the source of the parameter information, because it seems to me like the parameters section of the model file is loaded up into that property loaded_parameter_, but not actually used to configure the GBDT object. But now I'm not sure that that's right either haha.

For example... if you tried continued training after loading a model from a text file like this, would LightGBM respect feature_fraction=0.7? Or would it fall back to the default of 1.0?
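
One way to check that empirically (a sketch, assuming a training dataset X, y is available) would be to train with feature_fraction=0.7, save the model, and then continue training from the saved file:

    import lightgbm as lgb

    train_set = lgb.Dataset(X, y)
    bst = lgb.train({"feature_fraction": 0.7}, train_set, num_boost_round=5)
    bst.save_model("model.txt")

    # Continue training from the file: does the new model still use
    # feature_fraction=0.7, or does it fall back to the default of 1.0?
    bst2 = lgb.train({}, train_set, num_boost_round=5, init_model="model.txt")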

Collaborator Author


that being true doesn't necessarily guarantee that if you used this Booster for continued training that it would still actually use those parameters at the C++ side, right?

I believe the loaded_parameter_ attribute was added in #1495 to store the model parameters, but it isn't actually used anywhere other than when writing the parameters back to a file when there's no config.

    if (config_ != nullptr) {
      ss << "\nparameters:" << '\n';
      ss << config_->ToString() << "\n";
      ss << "end of parameters" << '\n';
    } else if (!loaded_parameter_.empty()) {
      ss << "\nparameters:" << '\n';
      ss << loaded_parameter_ << "\n";
      ss << "end of parameters" << '\n';
    }
Not sure when this happens though; maybe from the CLI when using the refit task.

So I see these parameters more as "informative" about how the Booster was trained, rather than something to be used for continued training. The original issue wanted objective to be loaded back for use in SHAP, and I think having the parameters may help when trying to replicate previous trainings or when categorical_feature is needed for inference. I actually thought of adding an argument to decide whether or not to load them; they may not always be useful, and that computation could then be skipped.
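
For example, the kind of usage this enables (a sketch, assuming a previously saved model.txt):

    import lightgbm as lgb

    bst = lgb.Booster(model_file="model.txt")
    # With this PR the parameters from the model file are available again:
    print(bst.params["objective"])                # e.g. "regression"
    print(bst.params.get("categorical_feature"))  # e.g. [1, 2]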

Collaborator


@guolinke Could you please help us understand the connection between config_ and loaded_parameter_?

Collaborator

@guolinke guolinke Sep 8, 2022


IMO, I think the "loaded_parameter_" is used to check the parameters used for the model file. For continued training, the current (new) "config_" is used.

Collaborator Author


I got them from loaded_parameter_ because it seemed easier. If the contents of both attributes are equivalent we can document that, i.e. when loading a model from a file, the contents of the booster's params attribute are also the configuration for continued training.

Collaborator


@jmoralez @jameslamb any further comments about this?

Collaborator Author


I added a warning about passing parameters to the booster constructor in 9467814. In order to call Booster.update you need a training set, and if you initialize a booster with a training set, lgb.Booster(model_file=..., train_set=...), the model_file argument gets ignored, so I don't think there's a way to do incremental training based on the Booster object alone. Thus, I don't think it really matters what the parameters of the loaded booster are on the cpp side, because incremental training would go through lgb.train, which already supports overriding the parameters for the new iterations. WDYT?
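
For completeness, the incremental-training path referred to here looks roughly like this (a sketch, assuming new_X and new_y hold newly arrived data):

    import lightgbm as lgb

    new_data = lgb.Dataset(new_X, new_y)
    # Parameters passed here apply to the additional iterations, regardless of
    # what is stored in the model file used as init_model.
    bst = lgb.train(
        {"learning_rate": 0.01, "num_leaves": 63},
        new_data,
        num_boost_round=10,
        init_model="model.txt",
    )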

Collaborator


it sounds good to me

Collaborator


Thanks @jmoralez

It's an excellent point that Booster's constructor has logic like

if train_set is not None:
    ...
elif model_file is not None:
    ...

Meaning that you can't have both. I agree with the warning you've added and appreciate you adding a test for it as well!

I'm sorry for holding up this PR for so long over this. I'm still confused about the relationship between config_ and loaded_parameter_, but I don't have any more concerns about this specific PR.

Collaborator


I should also add, as I mentioned in #5424 (review) ... I think we should merge this because future changes probably wouldn't require any user-facing breaking changes. The fact that the new function you've added in c_api takes in a BoosterHandle means that we could swap it to using a different property of the Booster (e.g. config_ instead of loaded_parameter_) in the future without an API-level breaking change. Very nice!

@jmoralez jmoralez changed the title from "load parameters from model file" to "[python-package][R-package] load parameters from model file (fixes #2613)" on Sep 1, 2022
Collaborator

@jameslamb jameslamb left a comment


Thanks for the awesome work, and sorry I held this up for so long. The logic for how parameters flow through LightGBM is fairly complicated, and I wanted to be sure I understood the full implications of this change.

I think we should merge this 😀

@jameslamb
Collaborator

jameslamb commented Oct 11, 2022

@jmoralez You put SO MUCH work into this, only seems right that you be the one to push the merge button 😊

@github-actions

This pull request has been automatically locked because there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 19, 2023

Successfully merging this pull request may close these issues.

Load back saved parameters with save_model to Booster object
5 participants