[BUG] RandomForestRegressor to_json functionality broken #9

Open
jelc53 opened this issue Aug 24, 2022 · 3 comments

@jelc53
jelc53 commented Aug 24, 2022

  File "<string>", line 1, in <module>
  File "/virtualenvs/smartshift-load-forecasting-9TtSrW0h-py3.9/lib/python3.9/site-packages/sklearn_json/__init__.py", line 122, in to_json
    json.dump(serialize_model(model), model_json)
  File "/virtualenvs/smartshift-load-forecasting-9TtSrW0h-py3.9/lib/python3.9/site-packages/sklearn_json/__init__.py", line 57, in serialize_model
    return reg.serialize_random_forest_regressor(model)
  File "/virtualenvs/smartshift-load-forecasting-9TtSrW0h-py3.9/lib/python3.9/site-packages/sklearn_json/regression.py", line 307, in serialize_random_forest_regressor
    'min_impurity_split': model.min_impurity_split,
AttributeError: 'RandomForestRegressor' object has no attribute 'min_impurity_split'

@shon-otmazgin
shon-otmazgin commented Aug 31, 2022

I have the same issue. Did you find any workaround?

@jamesheaton

Also getting the same error; I've not found any workarounds.

@ZumelzuR
ZumelzuR commented Jun 13, 2023

Since min_impurity_split is no longer used in recent versions of sklearn, you should comment it out in the library's code.

You should also replace n_features_ with n_features_in_, and check the official documentation to see which attributes exist in your version.
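As a rough sketch of the same idea without hard-coding one sklearn version, the attributes can be read defensively so that a missing one is skipped instead of raising AttributeError. The helper names `collect_attrs` and `first_attr` are hypothetical (they are not part of sklearn or sklearn_json), and a `SimpleNamespace` stands in for a fitted model so the snippet runs without sklearn installed.

```python
from types import SimpleNamespace

_MISSING = object()  # sentinel, so attributes whose value is None still count as present


def collect_attrs(model, names):
    """Return {name: value} for the attributes that exist on `model`; skip the rest."""
    found = {}
    for name in names:
        value = getattr(model, name, _MISSING)
        if value is not _MISSING:
            found[name] = value
    return found


def first_attr(model, *names):
    """Return the first attribute in `names` that exists on `model`."""
    for name in names:
        value = getattr(model, name, _MISSING)
        if value is not _MISSING:
            return value
    raise AttributeError(f"none of {names} found on {type(model).__name__}")


# Stand-in for a model fitted with a recent sklearn (assumption: it has
# n_features_in_ but neither min_impurity_split nor n_features_).
newer_model = SimpleNamespace(min_impurity_decrease=0.0, n_features_in_=4)

params = collect_attrs(newer_model, ['min_impurity_decrease', 'min_impurity_split'])
n_features = first_attr(newer_model, 'n_features_in_', 'n_features_')
```

With a real estimator the same calls should apply unchanged, although I have only reasoned about it against the attribute names mentioned in this thread.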

My changes were:

def deserialize_random_forest(model_dict):
    model = RandomForestClassifier(**model_dict['params'])
    estimators = [deserialize_decision_tree(decision_tree) for decision_tree in model_dict['estimators_']]
    model.estimators_ = np.array(estimators)

    model.classes_ = np.array(model_dict['classes_'])
    model.n_features_in_ = model_dict['n_features_in_']  # matches the key written by serialize_random_forest
    model.n_outputs_ = model_dict['n_outputs_']
    model.max_depth = model_dict['max_depth']
    model.min_samples_split = model_dict['min_samples_split']
    model.min_samples_leaf = model_dict['min_samples_leaf']
    model.min_weight_fraction_leaf = model_dict['min_weight_fraction_leaf']
    model.max_features = model_dict['max_features']
    model.max_leaf_nodes = model_dict['max_leaf_nodes']
    model.min_impurity_decrease = model_dict['min_impurity_decrease']
    # model.min_impurity_split = model_dict['min_impurity_split']

and

def serialize_random_forest(model):
    serialized_model = {
        'meta': 'rf',
        'max_depth': model.max_depth,
        'min_samples_split': model.min_samples_split,
        'min_samples_leaf': model.min_samples_leaf,
        'min_weight_fraction_leaf': model.min_weight_fraction_leaf,
        'max_features': model.max_features,
        'max_leaf_nodes': model.max_leaf_nodes,
        'min_impurity_decrease': model.min_impurity_decrease,
        # 'min_impurity_split': model.min_impurity_split,
        'n_features_in_': model.n_features_in_,
        'n_features_': model.n_features_in_,  # legacy key name, same as in serialize_decision_tree
        'n_outputs_': model.n_outputs_,
        'classes_': model.classes_.tolist(),
        'estimators_': [serialize_decision_tree(decision_tree) for decision_tree in model.estimators_],
        'params': model.get_params()
    }

and

def serialize_decision_tree(model):
    tree, dtypes = serialize_tree(model.tree_)
    serialized_model = {
        'meta': 'decision-tree',
        'feature_importances_': model.feature_importances_.tolist(),
        'max_features_': model.max_features_,
        'n_classes_': int(model.n_classes_),
        'n_features_in_': model.n_features_in_,
        'n_features_': model.n_features_in_,
        'n_outputs_': model.n_outputs_,
        'tree_': tree,
        'classes_': model.classes_.tolist(),
        'params': model.get_params()
    }

If I have some time and this works well for me, I will open a PR.
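The loading side can be made tolerant in the same way, so that JSON written without 'min_impurity_split' (or with it, by an older install) still deserializes. This is only a minimal sketch: `restore_attrs` is a hypothetical helper, not something sklearn_json provides, and a `SimpleNamespace` stands in for the estimator.

```python
import json
from types import SimpleNamespace


def restore_attrs(model, model_dict, names):
    """Copy each key in `names` from `model_dict` onto `model`.

    Keys absent from the dict are skipped and returned, so the caller can
    decide whether a missing key matters.
    """
    skipped = []
    for name in names:
        if name in model_dict:
            setattr(model, name, model_dict[name])
        else:
            skipped.append(name)
    return skipped


# JSON as a patched serializer might write it: no 'min_impurity_split' key.
saved = json.loads('{"max_depth": 5, "min_impurity_decrease": 0.0}')

model = SimpleNamespace()  # stand-in for RandomForestClassifier(**params)
skipped = restore_attrs(
    model, saved, ['max_depth', 'min_impurity_decrease', 'min_impurity_split']
)
```

Returning the skipped names rather than silently ignoring them makes it easier to spot a key that is missing by mistake rather than because of a version difference.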
