inf/nan values in model are silently converted to zero when model is round-tripped to file #3941

Closed
mjmckp opened this issue Feb 12, 2021 · 2 comments
mjmckp commented Feb 12, 2021

When compiled with VS2017, a round trip of a LightGBM model to string (or file) via the GBDT::SaveModelToString and GBDT::LoadModelFromString functions silently replaces all inf/nan values in the model with zeros.

This happens because GBDT::LoadModelFromString calls CommonC::StringToArray<double> to convert the strings that hold the various parts of the model (thresholds, etc.). Each value is parsed by the helper below (from line 1105 of utils\common.h):

template<typename T>
struct __StringToTHelper<T, true> {
  T operator()(const std::string& str) const {
    double tmp;

    // Fast (common) path: For numeric inputs in RFC 7159 format:
    const bool fast_parse_succeeded = fast_double_parser::parse_number(str.c_str(), &tmp);

    // Rare path: Not in RFC 7159 format. Possible "inf", "nan", etc. Fallback to standard library:
    if (!fast_parse_succeeded) {
      std::stringstream ss;
      Common::C_stringstream(ss);
      ss << str;
      ss >> tmp;
    }

    return static_cast<T>(tmp);
  }
};

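When the input to this method is "inf", "-inf" or "nan", the fallback branch is meant to handle it, but instead it silently sets tmp, and therefore the return value, to zero. A likely cause is that the stream extraction ss >> tmp fails on these tokens: since C++11 a failed numeric extraction stores zero in the target, and the code never checks the stream state afterwards.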
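To make the failure mode concrete, here is a minimal sketch of a fallback that checks the stream state and then defers to std::strtod, which the C standard requires to accept "inf", "infinity" and "nan" with an optional sign. This is not the change made in LightGBM; the names StreamParseDouble and ParseDoubleAllowNonFinite are hypothetical, and imbuing the classic locale is only an assumed stand-in for Common::C_stringstream.

```cpp
#include <cmath>
#include <cstdlib>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not part of LightGBM): mirrors the fallback path quoted
// in the issue, but reports whether the stream extraction succeeded.
inline bool StreamParseDouble(const std::string& str, double* out) {
  std::stringstream ss;
  ss.imbue(std::locale::classic());  // assumption: stands in for Common::C_stringstream
  ss << str;
  ss >> *out;         // on failure, C++11 and later store 0 in *out
  return !ss.fail();  // the quoted code never performs this check
}

// Hypothetical fallback: try the stream first, then std::strtod, which
// accepts "inf", "infinity" and "nan" (case-insensitive, optional sign).
inline bool ParseDoubleAllowNonFinite(const std::string& str, double* out) {
  if (StreamParseDouble(str, out)) return true;
  const char* begin = str.c_str();
  char* end = nullptr;
  const double value = std::strtod(begin, &end);
  if (end == begin) return false;  // nothing was parsed; let the caller decide
  *out = value;
  return true;
}
```

Returning a bool instead of a bare value is a deliberate choice in this sketch: an unparseable token becomes visible to the caller rather than turning into a silent zero.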

This bug appears to have been introduced in 792c930 in Dec 2020.

Reproduction

  1. Train a LightGBM model that contains a non-finite value (e.g., inf) in one of the trees
  2. Save the model to file
  3. Load the model from file
  4. Inspect the loaded model and observe that all the non-finite values have been replaced by zero
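The same save/load/inspect cycle can be sketched without LightGBM as a round-trip check on a plain array of doubles. Everything here is a hypothetical stand-in: SerializeValues and DeserializeValues are not LightGBM functions, the space-separated text format is an assumption, and parsing is done with std::strtod so that non-finite tokens survive.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for "save": write values as space-separated text.
inline std::string SerializeValues(const std::vector<double>& values) {
  std::ostringstream out;
  out.imbue(std::locale::classic());
  out.precision(17);  // enough digits to round-trip any finite double
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i > 0) out << ' ';
    out << values[i];
  }
  return out.str();
}

// Hypothetical stand-in for "load": split on whitespace, parse with strtod.
inline std::vector<double> DeserializeValues(const std::string& text) {
  std::istringstream in(text);
  std::vector<double> values;
  std::string token;
  while (in >> token) {
    values.push_back(std::strtod(token.c_str(), nullptr));
  }
  return values;
}

// True when both values are NaN, or when they compare equal (covers +/-inf).
inline bool SameValue(double a, double b) {
  return (std::isnan(a) && std::isnan(b)) || a == b;
}

// The "inspect" step: every value must survive the round trip unchanged.
inline bool RoundTripPreserves(const std::vector<double>& values) {
  const std::vector<double> loaded = DeserializeValues(SerializeValues(values));
  if (loaded.size() != values.size()) return false;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (!SameValue(values[i], loaded[i])) return false;
  }
  return true;
}
```

A check of this shape, run against the real model serialization, would flag the regression described above, because a loaded zero does not match a saved inf or nan.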

Environment info

Windows 10, Visual Studio 2017.

@StrikerRUS (Collaborator)

Fixed via #3942.

