Adds support to specify categorical features in lgbm learner #197

Open
wants to merge 8 commits into
base: master
from

Commits on May 20, 2022

  1. Adds support to specify categorical features in lgbm learner

    LightGBM can achieve good accuracy when it uses native categorical features.
    Unlike plain one-hot encoding, LightGBM can find the optimal split over the categories of a feature.
    Such a split can give much better accuracy than a one-hot encoding solution.
    
    You can learn about this option in:
    https://github.com/microsoft/LightGBM/blob/master/docs/Advanced-Topics.rst#categorical-feature-support
    https://github.com/Microsoft/LightGBM/blob/v3.3.1/docs/Parameters.rst
    fberanizo committed May 20, 2022
    639c8a9
  2. c3f221f
  3. Fix typing annotation of categorical_features

    It is a Union[List[str], str]
    fberanizo committed May 20, 2022
    bf4dc15
  4. Changes lgbm.Dataset source to a pandas.DataFrame

    Previously, the source was the underlying numpy.ndarray, but
    categorical_feature='auto' only takes effect when a DataFrame is passed.
    fberanizo committed May 20, 2022
    4556c80
  5. 020b1d3
  6. e8a16e3

Commits on Aug 29, 2022

  1. Removes occurrences of DataFrame.values (ndarray)

    Uses the DataFrame wherever possible.
    fberanizo committed Aug 29, 2022
    03e84e0
  2. Applies the changes to lgbm Regressor

    Also adds a unit test.
    fberanizo committed Aug 29, 2022
    5a498d9
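
One likely reason for dropping DataFrame.values is that the resulting ndarray no longer carries column names or per-column dtypes, so the 'category' dtype is lost. The pandas-only sketch below shows that effect on a small frame. The column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical frame with one 'category' column and one float column.
df = pd.DataFrame({
    "city": pd.Categorical(["a", "b", "a", "c"]),
    "age": [31.0, 45.0, 27.0, 52.0],
})

# The DataFrame keeps per-column dtypes, including 'category'.
city_dtype = str(df.dtypes["city"])
print(city_dtype)  # category

# .values merges the mixed columns into a single object ndarray:
# the column names and the 'category' dtype are no longer available.
arr = df.values
print(arr.dtype)  # object

# With the DataFrame, the categorical columns can still be listed by dtype.
categorical_columns = df.select_dtypes(include="category").columns.tolist()
print(categorical_columns)  # ['city']
```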