Describe the bug
polars.Series accepts only polars datatypes as the value of the dtype argument, while pandas can take the datatype as a string. As a result, testing fails if I use all_null_like and set the dtype.
This isn't checked in test_common.py either.
Steps/Code to Reproduce
import polars as pl
import pandas as pd
import skrub._dataframe as sbd

col = pd.Series([1, 2, 3])
# this works
sbd.all_null_like(col, dtype="float32")
col_pl = pl.from_pandas(col)
# this doesn't
sbd.all_null_like(col_pl, dtype="float32")
Expected Results
0 NaN
1 NaN
2 NaN
dtype: float32
Actual Results
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
File /home/rcappuzz/Projects/skrub/bug_allnull.py:3
      1 # %%
      2 col_pl = pl.from_pandas(col)
----> 3 sdb.all_null_like(col_pl, dtype="float32")

File ~/.local/share/uv/python/cpython-3.10.15-linux-x86_64-gnu/lib/python3.10/functools.py:889, in singledispatch.<locals>.wrapper(*args, **kw)
    885 if not args:
    886     raise TypeError(f'{funcname} requires at least '
    887                     '1 positional argument')
--> 889 return dispatch(args[0].__class__)(*args, **kw)

File ~/Projects/skrub/skrub/_dataframe/_common.py:323, in _all_null_like_polars(col, length, dtype, name)
    321 if name is None:
    322     name = col.name
--> 323 return pl.Series(name, [None] * length, dtype=dtype)

File ~/Projects/skrub/skrenv/lib/python3.10/site-packages/polars/series/series.py:272, in Series.__init__(self, name, values, dtype, strict, nan_to_null)
    270     dtype = None
    271 elif dtype is not None and not is_polars_dtype(dtype):
--> 272     dtype = parse_into_dtype(dtype)
    274 # Handle case where values are passed as the first argument
    275 original_name: str | None = None

File ~/Projects/skrub/skrenv/lib/python3.10/site-packages/polars/datatypes/_parse.py:57, in parse_into_dtype(input)
     55     return _parse_union_type_into_dtype(input)
     56 else:
---> 57     return parse_py_type_into_dtype(input)

File ~/Projects/skrub/skrenv/lib/python3.10/site-packages/polars/datatypes/_parse.py:103, in parse_py_type_into_dtype(input)
    101     return _parse_generic_into_dtype(input)
    102 else:
--> 103     _raise_on_invalid_dtype(input)

File ~/Projects/skrub/skrenv/lib/python3.10/site-packages/polars/datatypes/_parse.py:181, in _raise_on_invalid_dtype(input)
    179 input_detail = "" if type(input) is type else f" (given: {input!r})"
    180 msg = f"cannot parse input {input_type} into Polars data type{input_detail}"
--> 181 raise TypeError(msg) from None

TypeError: cannot parse input of type 'str' into Polars data type (given: 'float32')
I tried a few things to fix the issue, but what fixes one problem breaks another. I wonder if it would be easier to have some function in _common.py that returns/casts based on the string rather than on the specific dtype, similar to what is done in the conftest for the df_module 🤔
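To make the idea concrete, here is a rough sketch of what such a string-based helper could look like. The names `polars_dtype_name` and `to_polars_dtype` and the lookup table are hypothetical and not part of skrub; the table covers only a few numeric and boolean dtypes, and polars is imported lazily on the assumption that it stays an optional dependency.

```python
# Hypothetical helper, not existing skrub code: translate pandas/numpy-style
# dtype strings into the names of the matching polars dtype classes.
_POLARS_DTYPE_NAMES = {
    "float32": "Float32",
    "float64": "Float64",
    "int8": "Int8",
    "int16": "Int16",
    "int32": "Int32",
    "int64": "Int64",
    "uint8": "UInt8",
    "uint16": "UInt16",
    "uint32": "UInt32",
    "uint64": "UInt64",
    "bool": "Boolean",
}


def polars_dtype_name(dtype):
    """Return the polars dtype class name for a dtype string such as 'float32'."""
    try:
        return _POLARS_DTYPE_NAMES[dtype.lower()]
    except KeyError:
        raise TypeError(f"no polars equivalent known for dtype {dtype!r}") from None


def to_polars_dtype(dtype):
    """Return a polars dtype for a string; pass anything else through unchanged."""
    import polars as pl  # imported lazily so the module loads without polars

    if isinstance(dtype, str):
        return getattr(pl, polars_dtype_name(dtype))
    # Already a polars dtype (or None): leave it for pl.Series to handle.
    return dtype
```

With something like this, `_all_null_like_polars` could pass `dtype=to_polars_dtype(dtype)` to `pl.Series`, so that `dtype="float32"` resolves to `pl.Float32` instead of raising the `TypeError` above. An explicit table is more verbose than a generic string transformation, but it makes it obvious which strings are supported and fails early on the rest.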