Different results for pmml and LightGBM #301

Closed
artakhm opened this issue Sep 15, 2021 · 1 comment

@artakhm

artakhm commented Sep 15, 2021

Hi,

It seems that missing values make the PMML predictions diverge from the original LightGBM model's predictions, even though the PMMLPipeline in Python still makes the same predictions as the model. There definitely were missing values in the training data, so I guess #297 does not apply here. I found that the predictions differ for rows with a missing value in the 'feat32' column.

The pmml file was generated like this:

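from sklearn.compose import ColumnTransformer
from sklearn_pandas import DataFrameMapper
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.decoration import Alias, ContinuousDomain
from sklearn2pmml.ensemble import SelectFirstClassifier
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml.preprocessing import ExpressionTransformer

# df_tmp (the data), feats (the column names) and boost (the LightGBM model) are defined elsewhere.

# Declare every feature as continuous and pass invalid values through unchanged.
score_mapper = DataFrameMapper(
    [([feat], [ContinuousDomain(invalid_value_treatment='as_is')]) for feat in feats],
    input_df=True, df_out=True)
score_mapper.fit(df_tmp)

# Identity-transform every feature except the last, renaming it to '<feature>_trans'.
mapper = ColumnTransformer([
    (feature, Alias(ExpressionTransformer('X[0]'), feature + '_trans'), [feature]) for feature in feats[:-1]
])
mapper.fit(score_mapper.transform(df_tmp))

pipe_boost = PMMLPipeline([('mapper', mapper), ('boosting', boost)])

# Route a row to the boosting pipeline only when column 35 equals 1.
steps = [
    ('seg', pipe_boost, 'X[35]==1')
]

pipe_conditional = PMMLPipeline([
    ('mapper', score_mapper),
    ('conditional_pipeline', SelectFirstClassifier(steps))
])

sklearn2pmml(pipe_conditional, 'model_file.pmml')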

I ran the PMML file with:
java -jar pmml-evaluator-example-executable.jar --model model_file.pmml --input data.csv --output output.csv

I've attached an example of the data and the model. The 'model' column in the data holds the original LightGBM predictions.
model.zip
data.csv
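
For anyone trying to reproduce this, here is a minimal sketch of how the mismatching rows could be located and checked for missing features. It uses only the standard library and a tiny made-up sample. It assumes the evaluator output has already been merged row by row with the input data, that the PMML prediction sits in a column named `pmml` (a hypothetical name; the real output column depends on the model), and that a missing value is an empty CSV cell.

```python
import csv
import io
import math


def parse_float(text):
    """Return float(text), or NaN for empty or unparsable cells."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan


def find_mismatches(rows, expected_col, actual_col, tol=1e-6):
    """Return indices of rows whose two prediction columns differ by more than tol."""
    bad = []
    for i, row in enumerate(rows):
        expected = parse_float(row.get(expected_col))
        actual = parse_float(row.get(actual_col))
        # A prediction that cannot be parsed is also reported as a mismatch.
        if math.isnan(expected) or math.isnan(actual) or abs(expected - actual) > tol:
            bad.append(i)
    return bad


def missing_counts(rows, indices, feature_cols):
    """Count, per feature, how many of the selected rows have an empty cell."""
    return {
        col: sum(1 for i in indices if (rows[i].get(col) or "").strip() == "")
        for col in feature_cols
    }


# Made-up sample: 'model' is the LightGBM prediction, 'pmml' the (assumed) PMML one.
SAMPLE = """feat31,feat32,model,pmml
0.5,1.0,0.80,0.80
0.1,,0.30,0.45
0.7,2.0,0.60,0.60
,,0.20,0.35
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
mismatched = find_mismatches(rows, "model", "pmml")
print(mismatched)
print(missing_counts(rows, mismatched, ["feat31", "feat32"]))
```

If one feature is empty in every mismatching row, as 'feat32' is in this toy sample, that feature is a good candidate for the split whose missing-value handling differs between the two scoring paths.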

@vruusmann
Member

Closing as a very likely duplicate of jpmml/jpmml-lightgbm#51

What's the fuss about LightGBM and missing values these days? Full moon or something?
