[python] Neptune LightGBM callback yields an error with the new logic introduced in v3.3.0 #4719

aptlin · 2021-10-26T13:58:12Z

Description

Neptune-LightGBM integration is broken due to changes in handling the callbacks in v3.3.0.

Reproducible example

Install the dependencies

pip install lightgbm==3.3.0 neptune-client==0.10.10 neptune-lightgbm==0.9.13

Run the example:

import copy
from neptune.new.integrations.lightgbm import NeptuneCallback as LGBMNeptuneCallback
import neptune.new as npt
import numpy as np
import pandas as pd
from scipy.special import expit

import lightgbm as lgb

#################
# Taken from https://github.com/microsoft/LightGBM/blob/master/examples/python-guide/logistic_regression.py
# Simulate some binary data with a single categorical and
#   single continuous predictor
np.random.seed(0)
N = 1000
X = pd.DataFrame({
    'continuous': range(N),
    'categorical': np.repeat([0, 1, 2, 3, 4], N / 5)
})
CATEGORICAL_EFFECTS = [-1, -1, -2, -2, 2]
LINEAR_TERM = np.array([
    -0.5 + 0.01 * X['continuous'][k]
    + CATEGORICAL_EFFECTS[X['categorical'][k]] for k in range(X.shape[0])
]) + np.random.normal(0, 1, X.shape[0])
TRUE_PROB = expit(LINEAR_TERM)
Y = np.random.binomial(1, TRUE_PROB, size=N)
#################

run = npt.init(mode='debug')
callback = LGBMNeptuneCallback(run)
callbacks = [callback]
model = lgb.LGBMClassifier()
model.fit(X, Y, callbacks=callbacks)

This fails with

TypeError: cannot pickle '_io.TextIOWrapper' object

Environment info

LightGBM version: 3.3.0

Additional Comments

neptune-lightgbm version: 0.9.13
neptune-client version: 0.10.10

This happens because the callbacks are deep-copied:

LightGBM/python-package/lightgbm/sklearn.py

Line 733 in fa4ecf4

callbacks = copy.deepcopy(callbacks)

Indeed, the error is the same for the code below:

import copy
copy.deepcopy(callbacks)
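
The same failure can be reproduced without Neptune or LightGBM. The following is a minimal sketch, assuming a hypothetical `FileHoldingCallback` class that keeps an open text file as an attribute. It is a stand-in for illustration only, not the actual Neptune callback, and the exact wording of the error message may differ between Python versions.

```python
import copy
import os
import tempfile


class FileHoldingCallback:
    """Hypothetical callback that holds an open text file (an _io.TextIOWrapper)."""

    def __init__(self, path):
        self.log_file = open(path, "w")

    def __call__(self, env):
        self.log_file.write(f"{env}\n")


# Create a temporary file for the callback to hold open.
fd, path = tempfile.mkstemp()
os.close(fd)

callback = FileHoldingCallback(path)
callbacks = [callback]

# deepcopy recurses into the callback's attributes and reaches the file object,
# which refuses to be pickled/copied.
try:
    copy.deepcopy(callbacks)
    deepcopy_error = None
except TypeError as exc:
    deepcopy_error = str(exc)

print(deepcopy_error)

# Clean up the temporary file.
callback.log_file.close()
os.remove(path)
```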

jameslamb · 2021-10-26T14:03:01Z

Thanks for the report! Just a small note, I've modified your initial description to include a link that is anchored to a specific commit. That way, that link will continue to point to the code you're talking about, even if the file is changed.

In case you're not familiar with this feature of GitHub, you can see https://docs.github.com/en/repositories/working-with-files/using-files/getting-permanent-links-to-files#press-y-to-permalink-to-a-file-in-a-specific-commit.

aptlin · 2021-10-26T14:04:26Z

Thanks! It was pinned to the v3.3.0 tag, so that should have been fine

jameslamb · 2021-10-26T14:15:09Z

> It was pinned to the v3.3.0 tag, so that should have been fine

Ah, sorry! I probably just misread that. 😂

> changes in handling the callbacks in v3.3.0

Looking at the blame, I can see the change you're talking about came from #4574, specifically.

I'm not sure if we've ever discussed this question:

do all callbacks need to be serializable?

Based on the usage I see in lightgbm.train(), I don't think they're stored in the Booster object and therefore wouldn't be persisted if you pickle that object:

LightGBM/python-package/lightgbm/engine.py

Line 229 in 717f037

# process callbacks

That copy.deepcopy() performed at the point you've linked to (added in #4574) is there because lightgbm appends some other callback functions internally, and we wanted to avoid having side-effects on the user-provided list. I think that's desirable behavior.

Can you think of a way to achieve both the behavior "lightgbm training does not have side effects on the list of callbacks you pass in" AND "lightgbm training can accept items in callbacks that are not serializable"?

If not, maybe the solution here is a change in neptune instead of lightgbm.

aptlin · 2021-10-26T14:45:08Z

It seems strange to limit callbacks to serializable objects. What's the reasoning for that, other than avoiding side effects (which are desirable given the Neptune integration, since we want to send metrics during the training process)?

aptlin · 2021-10-26T14:50:40Z

Tagging neptune integration contributors who might be interested in this issue.
cc: @shnela @PiotrJander

StrikerRUS · 2021-10-26T15:45:59Z

Actually, we deepcopy callbacks because internally some deprecated args are transformed into callbacks (this will be removed in the future) and we append one more callback

LightGBM/python-package/lightgbm/sklearn.py

Lines 745 to 746 in fa4ecf4

evals_result = {}
callbacks.append(record_evaluation(evals_result))

Given the list mutability it is unacceptable to append to the originally provided callback argument. Also, there can be some critical crashes in the sklearn-wrapper specifically due to this reason. For example, refer to

LightGBM/python-package/lightgbm/sklearn.py

Lines 628 to 630 in fa4ecf4

# Do not modify original args in fit function
# Refer to https://github.com/microsoft/LightGBM/pull/2619
eval_metric_list = copy.deepcopy(eval_metric)

#2610.
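
To illustrate the side effect being avoided, here is a minimal sketch assuming a hypothetical `fit_mutating()` function that appends an internal callback to the caller's list in place. It is not LightGBM code; it only shows how repeated calls would leak internal callbacks into the user's list.

```python
def fit_mutating(callbacks):
    """Hypothetical fit() that appends an internal callback without copying the list."""
    callbacks.append(lambda env: None)  # stand-in for an internal callback


user_callbacks = []

# Each call grows the caller's list, so internal callbacks accumulate
# across repeated fits.
fit_mutating(user_callbacks)
fit_mutating(user_callbacks)

leaked = len(user_callbacks)
print(leaked)  # 2
```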

But I guess (this requires more investigation) that in this particular case a shallow copy of callbacks will be enough. @aptlin Will this fit your (Neptune's) needs?

aptlin · 2021-10-26T15:52:03Z

Yes, thanks! The only critical thing was just passing the neptune integration as one of the callbacks.

copy.copy(callbacks) with callbacks as in the example I gave works fine

StrikerRUS · 2021-10-26T16:40:50Z

@jameslamb Don't you mind including this into the upcoming 3.3.1 bug fixing release?

jameslamb · 2021-10-26T17:09:54Z

> Don't you mind including this into the upcoming 3.3.1 bug fixing release

@StrikerRUS sure, seems ok to me, as long as it can be done quickly. I don't want to delay too long, since I think getting the R package back up on CRAN should be treated as a fairly urgent concern.

> Actually, we deepcopy callbacks because ... we append one more callback
> Given the list mutability it is unacceptable to append

To be clear, this is what I meant by "wanted to avoid having side-effects on the user-provided list". Thanks for adding a bit more detail there in case my language wasn't clear!

aptlin · 2021-10-26T17:12:29Z

Right, so I think all it would take is just replacing deepcopy with copy here

LightGBM/python-package/lightgbm/sklearn.py

Line 733 in fa4ecf4

callbacks = copy.deepcopy(callbacks)
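
A rough sketch of what the proposed change buys, under stated assumptions: `fit_sketch()`, `record_evaluation_stub()` and `LockHoldingCallback` are hypothetical names used for illustration, not LightGBM or Neptune code. A shallow copy creates a new list, so appending to it leaves the caller's list untouched, while the callback objects themselves are shared and never need to be picklable.

```python
import copy
import threading


class LockHoldingCallback:
    """Hypothetical callback with an attribute that cannot be deep-copied."""

    def __init__(self):
        self.lock = threading.Lock()

    def __call__(self, env):
        with self.lock:
            pass


def record_evaluation_stub(evals_result):
    """Hypothetical stand-in for a callback appended internally."""
    def _callback(env):
        evals_result.setdefault("calls", []).append(env)
    return _callback


def fit_sketch(callbacks=None):
    # Shallow copy: a new list object, but the same callback objects inside.
    callbacks = [] if callbacks is None else copy.copy(callbacks)
    evals_result = {}
    callbacks.append(record_evaluation_stub(evals_result))
    return callbacks


user_callbacks = [LockHoldingCallback()]
internal_callbacks = fit_sketch(user_callbacks)

print(len(user_callbacks), len(internal_callbacks))
```

One consequence of the shallow copy is that any state a callback keeps is shared with the caller's object, which appears to be what a logging integration wants.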

@StrikerRUS will you create a PR please?

StrikerRUS · 2021-10-26T17:21:20Z

Yeah, I can do this today.

github-actions · 2023-08-23T14:09:14Z

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

jameslamb changed the title ~~Neptune LightGBM callback yields an error with the new logic introduced in v3.3.0~~ [python] Neptune LightGBM callback yields an error with the new logic introduced in v3.3.0 Oct 26, 2021

jameslamb added the bug label Oct 26, 2021

jameslamb added question and removed bug labels Oct 26, 2021

This was referenced Oct 26, 2021

release v3.3.1 #4715

Merged

[python][sklearn] Allow non-serializable objects in callbacks argument #4723

Merged

StrikerRUS closed this as completed in #4723 Oct 27, 2021

StrikerRUS mentioned this issue Feb 16, 2022

[python] make early_stopping callback pickleable #5012

Merged

StrikerRUS mentioned this issue Mar 17, 2022

[python-package] ensure that all callbacks are pickleable #5080

Closed

4 tasks

github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
