-
Notifications
You must be signed in to change notification settings - Fork 914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate cudf.isclose #17351
Deprecate cudf.isclose #17351
Conversation
This is a prototype implementation of rapidsai/build-infra#139 The work that this builds on: * rapidsai/gha-tools#118, which adds a shell wrapper that automatically creates spans for the commands that it wraps. It also uses the `opentelemetry-instrument` command to set up monkeypatching for supported Python libraries, if the command is python-based * https://github.com/rapidsai/shared-workflows/tree/add-telemetry, which installs the gha-tools work from above and sets necessary environment variables. This is only done for the conda-cpp-build.yaml shared workflow at the time of submitting this PR. The goal of this PR is to observe telemetry data sent from a GitHub Actions build triggered by this PR as a proof of concept. Once it all works, the remaining work is: * merge rapidsai/gha-tools#118 * Move the opentelemetry-related install stuff in https://github.com/rapidsai/shared-workflows/compare/add-telemetry?expand=1#diff-ca6188672785b5d214aaac2bf77ce0528a48481b2a16b35aeb78ea877b2567bcR118-R125 into https://github.com/rapidsai/ci-imgs, and rebuild ci-imgs * expand coverage to other shared workflows * Incorporate the changes from this PR to other jobs and to other repos Authors: - Mike Sarahan (https://github.com/msarahan) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#16924
Updates cmake to 3.28.6 in the JNI Dockerfile used to build the cudf jar. This helps avoid a bug in older cmake where FindCUDAToolkit can fail to find cufile libraries. Authors: - Jason Lowe (https://github.com/jlowe) Approvers: - Nghia Truong (https://github.com/ttnghia) - Gera Shegalov (https://github.com/gerashegalov) URL: rapidsai#17342
Apart of rapidsai#15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#17246
@@ -5,17 +5,17 @@ | |||
"args": { | |||
"CUDA": "11.8", | |||
"PYTHON_PACKAGE_MANAGER": "conda", | |||
"BASE": "rapidsai/devcontainers:24.12-cpp-cuda11.8-mambaforge-ubuntu22.04" | |||
"BASE": "rapidsai/devcontainers:25.02-cpp-cuda11.8-mambaforge-ubuntu22.04" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR targets 24.12. Why have all the dependencies been changed to 25.02?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR's title, name, description, and actual changes do not make sense to me.
Recommend closing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be an attempt at #13593, an issue I filed a while back. @the-gates-of-Zion I can help work with you on this, if you are interested. This should target branch-25.02
, so I will retarget the PR.
@@ -5322,6 +5322,29 @@ def isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False): | |||
5 False | |||
dtype: bool | |||
""" | |||
warnings.warn( | |||
"`cudf.close` is deprecated and will be removed in a future version of cudf. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"`cudf.close` is deprecated and will be removed in a future version of cudf. " | |
"`cudf.isclose` is deprecated and will be removed in a future version of cudf. " |
''' | ||
import cupy as cp | ||
import pandas as pd | ||
from cudf.core.column import ( | ||
as_column, | ||
) | ||
|
||
a = pd.array([1.0, 2.0, None]) | ||
b = pd.array([1.0, 2.1, None]) | ||
|
||
a_col = as_column(a) | ||
a_array = cupy.asarray(a_col.data_array_view(mode="read")) | ||
|
||
b_col = as_column(b) | ||
b_array = cupy.asarray(b_col.data_array_view(mode="read")) | ||
|
||
result = cp.isclose(a, b, equal_nan=True) | ||
print(result) # Output: [ True False True] | ||
''', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We noted in #13593: "As part of the deprecation, we decided to add a warning message indicating how to support nulls while using cupy.isclose." However, I am not sure if this is the right code snippet.
The code we want to give to the user as a replacement for this deprecated code path is essentially the current implementation in this method. Basically it has two steps:
- Call
cupy.isclose
to see if the values are "close" to each other. This does not account for null values in the input. - Compare the null masks of each input. If
equal_nan
isFalse
, we mark all null values asFalse
("not close") in the result. Ifequal_nan
isTrue
, we set the values where one column's input is null asFalse
and the values where both columns' input are null asTrue
.
I would suggest we add this snippet to the page "Working with missing data": https://docs.rapids.ai/api/cudf/stable/user_guide/missing-data/
Then we can link to this snippet in the docs from the deprecated method's docstring and in the deprecation warning.
Does that make sense? Let me know if you have questions.
Moves `cpp/benchmarks/string/translate.cpp` implementation from google-bench to nvbench. This is benchmark for the `cudf::strings::translate` API. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Nghia Truong (https://github.com/ttnghia) URL: rapidsai#17325
Contributes to rapidsai#17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) URL: rapidsai#17318
…, ...)` type (rapidsai#17604) From an offline discussion, a pandas object with an `category[interval[...]]` type would be incorrectly be interpreted as a `category[struct[...]]` type. This can cause further problems with `cudf.pandas` as a `category[struct[...]]` type cannot be properly interpreted by pandas. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17604
) Contributes to rapidsai#17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17456
…sai#17460) Contributes to rapidsai#17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17460
Clang-tidy does not like `[[nodiscard]]` after `__device__` and I don't like red squigly lines. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Yunsong Wang (https://github.com/PointKernel) - David Wendt (https://github.com/davidwendt) URL: rapidsai#17608
Recent changes in dask and dask-expr have broken `dask_cudf.read_csv` (dask/dask-expr#1178, dask/dask#11603). Fortunately, the breaking changes help us avoid legacy CSV code in the long run. Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Peter Andreas Entschev (https://github.com/pentschev) - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#17612
Apart of rapidsai#17565 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: rapidsai#17599
…i#17611) A recent nightly failure discovered by @davidwendt here: https://github.com/rapidsai/cudf/actions/runs/12367794950/job/34543121050 indicates an environment cannot be created with `pytorch>=2.4.0` and `pyarrow==14.0.0 & 14.0.1`. Thus this bump to `14.0.2`. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17611
This PR has two fixes: - Since we're pinning to a commit, a shallow clone will start failing as soon as HEAD gets bumped on the main branch (which will happen next when cuml/raft logging features are merged). We need to stop using shallow clones. - The CMake code for setting the default logging levels was setting the wrong macro name. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17588
…apidsai#17610) Fixes memcheck error found in nightly build checks in the STREAM_REPLACE_TEST's `ReplaceTest.NormalizeNansAndZerosMutable` gtest. The mutable-view passed to the `cudf::normalize_nans_and_zeros` API was pointing to invalidated data. The following line created the invalid view ``` cudf::mutable_column_view mutable_view = cudf::column(input, cudf::test::get_default_stream()); ``` The temporary `cudf::column` is destroyed once the `mutable_view` is created so this view would now point to a freed column. The view must be created from a non-temporary column and also must be non-temporary itself so that it is not implicitly converted to a `column_view`. Error introduced by rapidsai#17436 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Vukasin Milovanovic (https://github.com/vuule) URL: rapidsai#17610
Contributes to rapidsai#17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17555
Return a `cudf::detail::host_vector` from `flatten_single_pass_aggs` because this vector is eventually copied to the device and we might want to use pinned memory. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Karthikeyan (https://github.com/karthikeyann) - MithunR (https://github.com/mythrocks) - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17605
…apidsai#17618) We stopped running clang-tidy on test files in rapidsai#17078 so no reason to carry around these patches anymore. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17618
Apart of rapidsai#17565 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: rapidsai#17606
Simplifies telemetry a bit. More details at rapidsai/shared-actions#28. Telemetry will still not be collected until @ajschmidt8 enables the TELEMETRY_ENABLED environment variable for this repo. Authors: - Mike Sarahan (https://github.com/msarahan) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17615
…com/enoch-gives-thanks/cudf into pull-request/rapidsai#13593" This reverts commit d27c5e5, reversing changes made to c418cb2.
Closing the PR due to I accidentally rewrote the history of the previous commits. |
Description
This PR adds deprecation warning to
cudf.isclose
and direct user to an alternative solution.closes #13593
Checklist