[ENH] dummy supervised regressor with polars support #440

julian-fong · 2024-08-01T21:25:54Z

Implement the DummyProbaRegressor with complete end-to-end polars support in skpro.

Some current limitations:

fit inside DummyProbaRegressor uses skpro.distributions, which only supports pandas DataFrames, so it needs a workaround.

predict_proba also uses skpro.distributions, which leads to the same issue and will need a workaround as well.

@fkiraly any suggestions on how to implement?


julian-fong · 2024-08-11T13:23:28Z

@fkiraly I've run into a problem with the current implementation of polars support in skpro.

If an estimator specifies

"X_inner_mtype": "polars_eager_table",
"y_inner_mtype": "polars_eager_table",

then, during the tests, pandas DataFrames are converted into polars DataFrames via check_X in the boilerplate code in regression.base, and they lose their index in the process.

Since the index is already lost in check_X, it cannot be retrieved in the private methods, which only receive the polars DataFrame without the index. Subsequent index asserts in the test files then fail once the output is converted back into a pandas DataFrame via the convert function.
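For illustration, the failure mode can be reproduced in miniature with pandas alone. This is only a sketch: the column-only round trip below is a stand-in (an assumption, not skpro code) for the pandas to polars to pandas hop, chosen because a polars DataFrame has no row index and so only the column data survives.

```python
import pandas as pd

# pandas frame with a non-default (non-range) index
X = pd.DataFrame({"a": [1.0, 2.0, 3.0]}, index=[10, 20, 30])

# Stand-in for the conversion hop: keep only the column data,
# which is all that a frame without a row index can carry.
columns_only = {col: X[col].tolist() for col in X.columns}
X_roundtrip = pd.DataFrame(columns_only)

# The original index is gone and replaced by a default range index,
# so an assert comparing the two indices would fail.
index_preserved = X_roundtrip.index.equals(X.index)
print(list(X_roundtrip.index))  # [0, 1, 2]
print(index_preserved)  # False
```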

fkiraly · 2024-08-12T06:50:53Z

Interesting - I thought it saved the index as a variable __index__ if it was not a range index.

Or is that only in the sktime implementation by @pranavvp16?

julian-fong · 2024-08-12T11:48:30Z

I think that would be in the sktime implementation. Currently, we do not save the index anywhere in the boilerplate if the incoming mtype is in polars format.

fkiraly · 2024-08-13T11:29:46Z

May I suggest syncing the two implementations? I think the sktime type by @pranavvp16 stores a non-range index as a reserved variable.

Adds index support as part of #440 and syncs up the polars conversion utilities between skpro and sktime. The corresponding sktime PR for the polars conversion utilities is sktime/sktime#6455.

In this PR: if a pandas DataFrame is the `from_type` and a polars frame is the `to_type`, then during the conversion we save the index (assumed never to be in multi-index format) and insert it as an individual column with the column name `__index__`. The resulting pandas DataFrame is then converted to a polars DataFrame.

In the inverse function, when converting from a polars DataFrame to a pandas DataFrame, if the column `__index__` exists in the pandas DataFrame post-conversion, we map that column to the index before returning the pandas DataFrame.

After this is merged, #447 will be implemented as a `polars`-only estimator. Tests will also be written to check polars input end to end, as well as pandas input and output through the polars estimator (i.e. pandas input into polars estimator -> polars predictions -> pandas output).
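The index handling described above can be sketched on the pandas side as follows. This is a minimal illustration, not the code from the PR: the helper names `stash_index` and `restore_index` are hypothetical, the index is always stored (rather than only when it is not a range index), and the original index name is not preserved. The actual polars conversion would sit between the two calls.

```python
import pandas as pd

INDEX_COL = "__index__"  # reserved column name mentioned in the thread


def stash_index(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: copy a single-level index into a reserved column."""
    if isinstance(df.index, pd.MultiIndex):
        raise ValueError("multi-index is not handled in this sketch")
    out = df.reset_index(drop=True)
    # Store the index values as the first column so they survive
    # conversion to a frame type that has no row index.
    out.insert(0, INDEX_COL, df.index.to_numpy())
    return out


def restore_index(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: map the reserved column back to the index."""
    if INDEX_COL not in df.columns:
        return df
    out = df.set_index(INDEX_COL)
    out.index.name = None  # simplification: the original index name is dropped
    return out


X = pd.DataFrame({"a": [1.0, 2.0, 3.0]}, index=[10, 20, 30])
stashed = stash_index(X)
# ... the pandas -> polars -> pandas conversion would happen here ...
restored = restore_index(stashed)
```

With the index carried as an ordinary column, `restored` has the same index as `X`, which is what the index asserts in the tests compare.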

julian-fong added the feature request New feature or request label Aug 1, 2024

fkiraly added the module:regression probabilistic regression module label Aug 2, 2024

julian-fong mentioned this issue Aug 3, 2024

[ENH] Add polars version of dummy proba regressor #447

Open

julian-fong mentioned this issue Aug 14, 2024

[ENH] Polars adapter enhancements #449

Merged
