Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python-package] ensure that all callbacks are pickleable #5080

Closed
4 tasks done
jameslamb opened this issue Mar 17, 2022 · 5 comments · Fixed by #5109
Closed
4 tasks done

[python-package] ensure that all callbacks are pickleable #5080

jameslamb opened this issue Mar 17, 2022 · 5 comments · Fixed by #5109

Comments

@jameslamb
Copy link
Collaborator

jameslamb commented Mar 17, 2022

Summary

All callback functions should be serializable with pickle, clooudpickle, and joblib.

Motivation

As described in #5012, in some interfaces to LightGBM, it's necessary to broadcast LightGBM parameters or objects from the lightgbm Python library to multiple processes / machines. The primary mechanism for this in Python is to serialize objects with a library like pickle, cloudpickle, or joblib.

#5012 made lgb.callback.early_stopping serializable, for the benefit of lightgbm-ray. I expect that would also be necessary to use callbacks in lightgbm.dask, and for any other settings where users want to pickle/unpickle LightGBM objects.

Description

Functions in lightgbm.callback:

Tests similar to the following should be added for each of these functions, to ensure that they're pickleable and that that remains true after future changes to this project.

@pytest.mark.parametrize('serializer', ["pickle", "joblib", "cloudpickle"])
def test_early_stopping_callback_is_picklable(serializer, tmp_path):
callback = lgb.early_stopping(stopping_rounds=5)
tmp_file = tmp_path / "early_stopping.pkl"
pickle_obj(
obj=callback,
filepath=tmp_file,
serializer=serializer
)
callback_from_disk = unpickle_obj(
filepath=tmp_file,
serializer=serializer
)
assert callback.stopping_rounds == callback_from_disk.stopping_rounds

It's possible that just adding such tests will be enough, and that the remaining functions are already pickleable. But if not, then changes will need to be made to support this.

@jameslamb
Copy link
Collaborator Author

Closing this issue and adding it to #2302, per this project's policy for feature requests.

Anyone is welcome to contribute this, and a fix can be done across multiple pull requests. If you're interested in working on this, please comment below and maintainers will re-open the issue.

@StrikerRUS
Copy link
Collaborator

@jameslamb Thanks a lot for writing this up!

It's possible that just adding such tests will be enough, and that the remaining functions are already pickleable. But if not, then changes will need to be made to support this.

Let me kindly disagree with this statement. I believe we should refactor current interface converting it into classes anyway for the sake of code readability and to avoid any unintended side effects.

@StrikerRUS
Copy link
Collaborator

Also, I think it worth mentioning that there shouldn't be any code refactoring that assumes that all callbacks are pickleable.

We should allow to use third-party callbacks that are unpickleable: #4719.

@jameslamb
Copy link
Collaborator Author

I believe we should refactor current interface converting it into classes anyway for the sake of code readability and to avoid any unintended side effects.

I don't disagree with the idea that all callbacks should be callable classes. I just think that's larger than the scope of what I'm proposing in this issue.

If we want to change the callbacks to callable classes for reasons other than pickleability, then I'd consider that separate from this work. If you'd prefer to change this feature request to "convert all callbacks to callable classes", and add that one of the conditions for that is "test that all of them are pickleable", that's fine with me. I just was thinking it made it a bit easier to separate those two concerns.

there shouldn't be any code refactoring that assumes that all callbacks are pickleable.

Good clarification. I was referring only to those built into the lightgbm Python package.

@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants