-
Notifications
You must be signed in to change notification settings - Fork 922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add public APIs to Access Underlying cudf
and pandas
Objects from cudf.pandas
Proxy Objects
#17629
Changes from 21 commits
b5eea1f
7bc76e5
3cdfe94
34375dc
31f9e99
72ba73f
3fd679f
37764c2
bbe0fa4
cf6888f
528c189
6b744f4
9ec0215
a3c49fd
3f06e70
95ba799
a2e97f5
b96f8ff
0044d8f
a77fecc
aafbd31
f15d37b
f94253d
daaa5df
eaa23cb
9c58a71
69837f3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
# SPDX-FileCopyrightText: Copyright (c) 2023-2024, NVIDIA CORPORATION & AFFILIATES. | ||
# SPDX-FileCopyrightText: Copyright (c) 2023-2025, NVIDIA CORPORATION & AFFILIATES. | ||
# All rights reserved. | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
|
@@ -176,3 +176,7 @@ def ndarray__array_ufunc__(self, ufunc, method, *inputs, **kwargs): | |
cupy._core.flags.Flags, | ||
_numpy_flagsobj, | ||
) | ||
|
||
|
||
def is_cudf_pandas_ndarray(obj): | ||
return is_proxy_object(obj) and isinstance(obj, ndarray) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we remove this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. removed. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
# SPDX-FileCopyrightText: Copyright (c) 2023-2024, NVIDIA CORPORATION & AFFILIATES. | ||
# SPDX-FileCopyrightText: Copyright (c) 2023-2025, NVIDIA CORPORATION & AFFILIATES. | ||
# All rights reserved. | ||
# SPDX-License-Identifier: Apache-2.0 | ||
import abc | ||
|
@@ -35,7 +35,9 @@ | |
_fast_slow_function_call, | ||
_FastSlowAttribute, | ||
_FunctionProxy, | ||
_maybe_wrap_result, | ||
_Unusable, | ||
is_proxy_object, | ||
make_final_proxy_type as _make_final_proxy_type, | ||
make_intermediate_proxy_type as _make_intermediate_proxy_type, | ||
register_proxy_func, | ||
|
@@ -266,6 +268,12 @@ def custom_repr_html(obj): | |
html_formatter.for_type(DataFrame, custom_repr_html) | ||
|
||
|
||
def _Series_dtype(self): | ||
# Fast-path to extract dtype from the current | ||
# object without round-tripping through the slow<->fast | ||
return _maybe_wrap_result(self._fsproxy_wrapped.dtype, None) | ||
|
||
|
||
Series = make_final_proxy_type( | ||
"Series", | ||
cudf.Series, | ||
|
@@ -285,6 +293,7 @@ def custom_repr_html(obj): | |
"_constructor": _FastSlowAttribute("_constructor"), | ||
"_constructor_expanddim": _FastSlowAttribute("_constructor_expanddim"), | ||
"_accessors": set(), | ||
"dtype": _Series_dtype, | ||
}, | ||
) | ||
|
||
|
@@ -1704,6 +1713,10 @@ def holiday_calendar_factory_wrapper(*args, **kwargs): | |
) | ||
|
||
|
||
def isinstance_cudf_pandas(obj, type_name): | ||
return is_proxy_object(obj) and obj.__class__.__name__ == type_name | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the second half of this sufficiently precise? Can we rely on names alone? I would like to use some kind of "real" check like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think allowing real types will make this API more complex and confusing. We don't want to allow xgboost has this string based checks: https://github.com/dmlc/xgboost/pull/11014/files#diff-bf11fe0b3133c5b20253ea67b82c3e576513c8079f3be355fc323e3e903d989cR852 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Typically string-based checks are used as a way to avoid requiring an import of that package. We could still accept a class in the API and do a string-based check of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated the api to use this approach. 👍 |
||
|
||
# timestamps and timedeltas are not proxied, but non-proxied | ||
# pandas types are currently not picklable. Thus, we define | ||
# custom reducer/unpicker functions for these types: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
# SPDX-FileCopyrightText: Copyright (c) 2023-2024, NVIDIA CORPORATION & AFFILIATES. | ||
# SPDX-FileCopyrightText: Copyright (c) 2023-2025, NVIDIA CORPORATION & AFFILIATES. | ||
# All rights reserved. | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
|
@@ -204,6 +204,12 @@ def _fsproxy_fast_to_slow(self): | |
return fast_to_slow(self._fsproxy_wrapped) | ||
return self._fsproxy_wrapped | ||
|
||
def as_gpu_object(self): | ||
return self._fsproxy_slow_to_fast() | ||
|
||
Comment on lines
+207
to
+208
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does it make sense to call |
||
def as_cpu_object(self): | ||
return self._fsproxy_fast_to_slow() | ||
|
||
@property # type: ignore | ||
def _fsproxy_state(self) -> _State: | ||
return ( | ||
|
@@ -221,6 +227,8 @@ def _fsproxy_state(self) -> _State: | |
"_fsproxy_slow_type": slow_type, | ||
"_fsproxy_slow_to_fast": _fsproxy_slow_to_fast, | ||
"_fsproxy_fast_to_slow": _fsproxy_fast_to_slow, | ||
"as_gpu_object": as_gpu_object, | ||
"as_cpu_object": as_cpu_object, | ||
"_fsproxy_state": _fsproxy_state, | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally
isinstance
takes a class, not a string. Is that an option here, too?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
responded here; #17629 (comment)