-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Python 3.12, update Kafka dependencies to 2.5.x #16745
Conversation
@@ -39,7 +39,7 @@ requirements: | |||
- python | |||
- rapids-build-backend >=0.3.0,<0.4.0.dev0 | |||
- setuptools | |||
- python-confluent-kafka >=1.9.0,<1.10.0a0 | |||
- python-confluent-kafka >=2.5.0,<2.6.0a0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There aren't any python-confluent-kafka
releases in the 1.x series that have Python 3.12 packages on conda-forge (https://anaconda.org/conda-forge/python-confluent-kafka).
Resulting in build errors like this:
LibMambaUnsatisfiableError: Encountered problems while solving:
- package python-confluent-kafka-1.9.0-py310h5764c6d_0 requires python_abi 3.10.* *_cp310, but none of the providers can be installed
Could not solve for environment specs
The following packages are incompatible
├─ python-confluent-kafka >=1.9.0,<1.10.0a0 is installable with the potential options
│ ├─ python-confluent-kafka [1.9.0|1.9.2] would require
│ │ └─ python_abi 3.10.* *_cp310, which can be installed;
│ ├─ python-confluent-kafka [1.9.0|1.9.2] would require
│ │ └─ python_abi 3.7.* *_cp37m, which can be installed;
│ ├─ python-confluent-kafka [1.9.0|1.9.2] would require
│ │ └─ python_abi 3.8.* *_cp38, which can be installed;
│ ├─ python-confluent-kafka [1.9.0|1.9.2] would require
│ │ └─ python_abi 3.9.* *_cp39, which can be installed;
│ └─ python-confluent-kafka 1.9.2 would require
│ └─ python_abi 3.11.* *_cp311, which can be installed;
└─ python 3.12** is not installable because there are no viable options
├─ python [3.12.0|3.12.1|...|3.12.5] would require
│ └─ python_abi 3.12.* *_cp312, which conflicts with any installable versions previously reported;
└─ python 3.12.0rc3 would require
└─ _python_rc, which does not exist (perhaps a missing channel).
Pushed a commit (8c206d7) which updates python-confluent-kafka
and librdkafka
to the latest minor version (the 2.5.x series) and continues the practice to pinning build and runtime dependencies within a single minor version of Kafka libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating the Kafka dependencies is fine with me.
cc: @jdye64 for awareness.
@@ -120,7 +120,7 @@ def visit_Name(self, node): | |||
self.stack.append(ColumnReference(col_id)) | |||
|
|||
def visit_Constant(self, node): | |||
if not isinstance(node, (ast.Num, ast.Str)): | |||
if not isinstance(node.value, (float, int, str, complex)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of ast.Num
here resulted in a deprecation warning that was treated as an error in tests (thanks @mroeschke !).
DeprecationWarning: ast.Num is deprecated and will be removed in Python 3.14; use ast.Constant instead
I looked into this... seems ast.Num
and other specialized constants have been deprecated since Python 3.8 and are set to be removed in Python 3.14.
This proposes checking the value of the ast.Constant
being passed in here against a hard-coded set of builtins.
- more details on that deprecation: https://github.com/python/cpython/blob/aa1339aaaa6363c38186defaa079d069b4cb08b2/Doc/whatsnew/3.8.rst?plain=1#L1668-L1671
- evidence that this list of builtins is identical to what
ast.Num
andast.Str
were doing: https://github.com/python/cpython/blob/a50a86db0fd66c7ea53e56e346057be1153d6d47/Lib/ast.py#L528-L534
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct to me. Thanks.
@@ -2359,7 +2359,7 @@ def to_dict( | |||
You can also specify the mapping type. | |||
|
|||
>>> from collections import OrderedDict, defaultdict | |||
>>> df.to_dict(into=OrderedDict) | |||
>>> df.to_dict(into=OrderedDict) # doctest: +SKIP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Starting with Python 3.12, the repr()
representation for an OrderedDict
has changed:
- OrderedDict dict/list representation python/cpython#101446
- https://docs.python.org/3/whatsnew/changelog.html
(thanks @bdice for pointing me to that changelog item!)
That's causing doctest failures like this:
E AssertionError: 1 of 13 doctests failed for DataFrame.to_dict:
E **********************************************************************
E File "/opt/conda/envs/test/lib/python3.12/site-packages/cudf/core/dataframe.py", line 2362, in DataFrame.to_dict
E Failed example:
E df.to_dict(into=OrderedDict)
E Expected:
E OrderedDict([('col1', OrderedDict([('row1', 1), ('row2', 2)])),
E ('col2', OrderedDict([('row1', 0.5), ('row2', 0.75)]))])
E Got:
E OrderedDict({'col1': OrderedDict({'row1': 1, 'row2': 2}), 'col2': OrderedDict({'row1': 0.5, 'row2': 0.75})})
E **********************************************************************
E 1 items had failures:
E 1 of 13 in DataFrame.to_dict
E ***Test Failed*** 1 failures.
E
E assert not 1
E + where 1 = TestResults(failed=1, attempted=13).failed
/__w/cudf/cudf/python/cudf/cudf/tests/test_doctests.py:125: AssertionError
This proposes just skipping those tests, here and in the docs for Series.to_dict()
.
@@ -8,6 +8,8 @@ filterwarnings = | |||
error | |||
ignore:::.*xdist.* | |||
ignore:::.*pytest.* | |||
# some third-party dependencies (e.g. 'boto3') still using datetime.datetime.utcnow() | |||
ignore:.*datetime.*utcnow.*scheduled for removal.*:DeprecationWarning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the cudf
and dask_cudf
tests on Python 3.12, there are a few dozen cases where DeprecationWarning
like the following are treated as errors
DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
Looking through the logs, those all appear to be coming from code in boto3
/ botocore
that we don't control. There are no direct uses of datetime.datetime.utcnow()
in this repo.
git grep utcnow
So this is proposing not failing tests when those warnings are raised.
Context on the deprecation of datetime.datetime.utcnow()
:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to scope this more narrowly? I haven't looked at the error logs to see how widespread this problem is, but I wonder if we can use something like a local context manager to wrap the places where this is occurring.
Let's not treat this as a blocker for this PR -- but we might want to make it a follow-up task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. It's coming from this codepath in botocore
self = <botocore.auth.S3SigV4Auth object at 0x7f6feb62f380>
request = <botocore.awsrequest.AWSRequest object at 0x7f700a3b3800>
def add_auth(self, request):
if self.credentials is None:
raise NoCredentialsError()
> datetime_now = datetime.datetime.utcnow()
E DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
https://github.com/rapidsai/cudf/actions/runs/10724599183/job/29742782194#step:7:4279
If it's possible to ignore warnings from specific dependencies, that'd be useful. I couldn't figure out how to do that in a few minutes of googling and docs reading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's possible to ignore warnings from specific dependencies, that'd be useful. I couldn't figure out how to do that in a few minutes of googling and docs reading.
In a warning filter, after specifying the warning category you can also specify :module:line
https://docs.python.org/3/library/warnings.html#describing-warning-filters
So in a follow up PR, you can add :botocore
at the end to just ignore this warning for that module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mroeschke , I finally went back and did this. You suggestion worked perfectly 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments -- none are blocking. Let's merge this if CI passes and decide whether to pursue any follow-up tasks.
@@ -39,7 +39,7 @@ requirements: | |||
- python | |||
- rapids-build-backend >=0.3.0,<0.4.0.dev0 | |||
- setuptools | |||
- python-confluent-kafka >=1.9.0,<1.10.0a0 | |||
- python-confluent-kafka >=2.5.0,<2.6.0a0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating the Kafka dependencies is fine with me.
cc: @jdye64 for awareness.
@@ -120,7 +120,7 @@ def visit_Name(self, node): | |||
self.stack.append(ColumnReference(col_id)) | |||
|
|||
def visit_Constant(self, node): | |||
if not isinstance(node, (ast.Num, ast.Str)): | |||
if not isinstance(node.value, (float, int, str, complex)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct to me. Thanks.
@@ -8,6 +8,8 @@ filterwarnings = | |||
error | |||
ignore:::.*xdist.* | |||
ignore:::.*pytest.* | |||
# some third-party dependencies (e.g. 'boto3') still using datetime.datetime.utcnow() | |||
ignore:.*datetime.*utcnow.*scheduled for removal.*:DeprecationWarning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to scope this more narrowly? I haven't looked at the error logs to see how widespread this problem is, but I wonder if we can use something like a local context manager to wrap the places where this is occurring.
Let's not treat this as a blocker for this PR -- but we might want to make it a follow-up task.
/merge |
#16745 added support for Python 3.12 in this project. When that was merged, nightly `unit-tests-cudf-pandas` jobs on Python 3.12 started failing, with errors from compiling `pandas`: ([build link](https://github.com/rapidsai/cudf/actions/runs/10733915866/job/29768130164)) That's only happening because we're running `pip install pandas==2.1` in those jobs, which matches exactly `pandas==2.1.0`, which does not have Python 3.12 wheels on PyPI (https://pypi.org/project/pandas/2.1.0/#files). ```text Collecting pandas==2.1 Downloading pandas-2.1.0.tar.gz (4.3 MB) ``` `pandas==2.1.1` DOES have Python 3.12 wheels on PyPI (https://pypi.org/project/pandas/2.1.1/#files). To fix those jobs, this proposes allowing the patch version of `pandas` installed in those CI jobs to float: * before: `pip install pandas==2.1` * after: `pip install pandas==2.1.*` Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Mike Sarahan (https://github.com/msarahan) - Bradley Dice (https://github.com/bdice) URL: #16763
Follow-up to #1380. Now that both `cudf` (rapidsai/cudf#16745) and `ucxx` (rapidsai/ucxx#276) have Python 3.12 wheels available, it should be possible to test `dask-cuda` against Python 3.12 in CI. This proposes that. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #1382
Proposes some small changes I've taken as follow-ups from previous work here. * #16745 filtered out all linter warnings about uses of `datetime.utcnow()` ... this PR limits that to only the warnings observed from `botocore` (so that the linter will helpfully warn us about such uses directly in `cudf`) - ref #16745 (comment) * reduces the verbosity of logs for wheel builds (`-vvv` to `-v`) - similar to rapidsai/cugraph#4651 ## Notes for Reviewers This is intentionally targeted at `24.12`. No need to rush this into 24.10 before code freeze. ### How I tested this <details><summary>locally in docker (click me)</summary> ```shell docker run \ --rm \ --gpus 1 \ -v $(pwd):/opt/work \ -w /opt/work \ -it rapidsai/citestwheel:latest \ bash pip install \ --prefer-binary \ 'cudf-cu12[test]==24.10.*,>=0.0.0a0' \ 'flask' \ 'flask-cors' \ 'moto>=4.0.8' \ 'boto3' \ 's3fs>=2022.3.0' cd ./python/cudf pytest \ cudf/tests/test_s3.py ``` </details> Authors: - James Lamb (https://github.com/jameslamb) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #16896
Proposes some small changes I've taken as follow-ups from previous work here. * #16745 filtered out all linter warnings about uses of `datetime.utcnow()` ... this PR limits that to only the warnings observed from `botocore` (so that the linter will helpfully warn us about such uses directly in `cudf`) - ref #16745 (comment) * reduces the verbosity of logs for wheel builds (`-vvv` to `-v`) - similar to rapidsai/cugraph#4651 ## Notes for Reviewers This is intentionally targeted at `24.12`. No need to rush this into 24.10 before code freeze. ### How I tested this <details><summary>locally in docker (click me)</summary> ```shell docker run \ --rm \ --gpus 1 \ -v $(pwd):/opt/work \ -w /opt/work \ -it rapidsai/citestwheel:latest \ bash pip install \ --prefer-binary \ 'cudf-cu12[test]==24.10.*,>=0.0.0a0' \ 'flask' \ 'flask-cors' \ 'moto>=4.0.8' \ 'boto3' \ 's3fs>=2022.3.0' cd ./python/cudf pytest \ cudf/tests/test_s3.py ``` </details> Authors: - James Lamb (https://github.com/jameslamb) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #16896
Description
Contributes to rapidsai/build-planning#40
This PR adds support for Python 3.12.
Other changes required to add that support:
librdkafka
/python-confluent-kafka
,1.9.* -> 2.5.*
(link to thread)ast.Num
in syntax tree parsing, in favor of checking the.value
of anast.Constant
against a hard-coded list of builtin types (link to thread)datetime.datetime.utcnow()
(link to thread)repr()
on anOrderedDict
(link to thread)Notes for Reviewers
This is part of ongoing work to add Python 3.12 support across RAPIDS.
It temporarily introduces a build/test matrix including Python 3.12, from rapidsai/shared-workflows#213.
A follow-up PR will revert back to pointing at the
branch-24.10
branch ofshared-workflows
once allRAPIDS repos have added Python 3.12 support.
This will fail until all dependencies have been updates to Python 3.12
CI here is expected to fail until all of this project's upstream dependencies support Python 3.12.
This can be merged whenever all CI jobs are passing.