Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

standardize and consolidate wheel installations in testing scripts #16575

Merged
merged 8 commits into from
Aug 19, 2024
13 changes: 9 additions & 4 deletions ci/cudf_pandas_scripts/pandas-tests/run.sh
Original file line number Diff line number Diff line change
Expand Up @@ -11,10 +11,15 @@ rapids-logger "Running Pandas tests using $PANDAS_TESTS_BRANCH branch and rapids
rapids-logger "PR number: ${RAPIDS_REF_NAME:-"unknown"}"

RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-pylibcudf-dep
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-cudf-dep
python -m pip install $(ls ./local-pylibcudf-dep/pylibcudf*.whl)
python -m pip install $(ls ./local-cudf-dep/cudf*.whl)[test,pandas-tests]

# Download the cudf and pylibcudf built in the previous step
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
vyasr marked this conversation as resolved.
Show resolved Hide resolved
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
"$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test,pandas-tests]" \
"$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test,pandas-tests]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't caused by this PR, but let's get these extras fixed here since they aren't quite right. pylibcudf doesn't have a pandas-tests extra. It does have a test extra, but I wouldn't expect it to be needed in this test run.

Suggested change
"$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test,pandas-tests]" \
"$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test,pandas-tests]"
"$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test,pandas-tests]" \
"$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like with this combined approach of installation, even though you're explicitly specifying to install pylibcudf that package doesn't get included in dependency resolution for cudf. The two step process might be necessary, or you can switch to adding --find-links to the command.

Copy link
Member Author

@jameslamb jameslamb Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do see the failure at https://github.com/rapidsai/cudf/actions/runs/10425017056/job/28875933840?pr=16575.

./dist./distLooking in indexes: https://pypi.org/simple, https://pypi.anaconda.org/rapidsai-wheels-nightly/simple, https://pypi.nvidia.com/
Processing ./dist/cudf_cu12-24.10.0a205-cp310-cp310-manylinux_2_28_x86_64.whl
Processing ./dist/pylibcudf_cu12-24.10.0a205-cp310-cp310-manylinux_2_28_x86_64.whl
...
ERROR: Could not find a version that satisfies the requirement pylibcudf-cu12==24.10.*,>=0.0.0a0 (from cudf-cu12[test]) (from versions: none)
ERROR: No matching distribution found for pylibcudf-cu12==24.10.*,>=0.0.0a0

But did you notice that the CUDA 11.8 version of that same job, using the same script, succeeded?

Processing ./dist/cudf_cu11-24.10.0a205-cp311-cp311-manylinux_2_28_aarch64.whl (from cudf-cu11==24.10.0a205)
Processing ./dist/pylibcudf_cu11-24.10.0a205-cp311-cp311-manylinux_2_28_aarch64.whl (from pylibcudf-cu11==24.10.0a205)
...
Successfully installed ... cudf-cu11-24.10.0a205 ... pylibcudf-cu11-24.10.0a205 ...

(build link)

If this method of dependency resolution didn't work, I'd have expected those jobs to both fail in the same way. I'm going to merge in latest branch-24.10 + apply your suggestion, let's see what happens on a re-run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, that thought flashed in and out of my brain while reviewing your PR and left no trace 😂 let's see what we see.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still failing, so there is something strange happening here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, thanks!! I hadn't noticed the different pip versions!!

That's totally it.

I was able to reproduce the failure locally.

code to do that (click me)
docker run \
    --rm \
    --env RAPIDS_REPOSITORY=rapidsai/cudf \
    --env _RAPIDS_SHA=7a12b67bd17056d4c980a3c6e89cc8ec9a6bd7c5_ \
    --env RAPIDS_REF_NAME=pull-request/16575 \
    --env RAPIDS_BUILD_TYPE="pull-request" \
    -it rapidsai/citestwheel:cuda12.5.1-ubuntu22.04-py3.10 \
    bash

RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

# Download the cudf and pylibcudf built in the previous step
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

python -m pip install -v \
  "$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" \
  "$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]"

Inspected the package metadata with pkginfo, didn't see any issues... names, versions, and dependency lists all looked correct.

Upgraded pip (to 24.2) and tried again:

pip install --upgrade pip

python -m pip install -v \
  "$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" \
  "$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]"

That succeeded 🎉

This makes me think we should be running pip install --upgrade pip every time the images at https://github.com/rapidsai/ci-imgs are rebuilt. Can you think of strong reason we shouldn't do that? If not, I'll put up a PR proposing that and see what others say.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main argument against updating pip unconditionally everywhere is that we shouldn't require our users to always have the latest version of pip to be able to install our software. This has come up before in devcontainers, for instance, where the latest version of pip available as a system-installable package is significantly older than the latest version available from PyPI, and the question is raised whether it's really a good idea to require a newer one.

For now, for the purpose of CI I think it's OK to always update so that at least we have consistent behavior. At some point we might want to introduce some form of testing to ensure that our packages are actually installable with different versions of pip, though. We have in the past done things (e.g. the rapids-dask-dependency shenanigans) that ultimately require a fairly new version of pip, which could be an onerous constraint on some users.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, ok yeah that makes sense.

I've opened rapidsai/ci-imgs#171 proposing upgrading pip in the wheel CI images.

At some point we might want to introduce some form of testing to ensure that our packages are actually installable with different versions of pip, though.

I agree. This type of thing should be tested directly (similar to rapidsai/build-planning#81) if we want to have high confidence in it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the new CI images are published (https://github.com/rapidsai/ci-imgs/actions/runs/10457672191), I've restarted CI here. Merge latest branch-24.10 into this (figured I might as well, to get a bit more useful test coverage, if I'm going to re-run so many jobs anyway).

Hopefully with newer pip across all the CI images, this newer, stricter pattern for these scripts will work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 looks like that worked!

https://github.com/rapidsai/cudf/actions/runs/10458119570/job/28961862201?pr=16575

Thanks a lot for the help, @vyasr !


RESULTS_DIR=${RAPIDS_TESTS_DIR:-"$(mktemp -d)"}
RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${RESULTS_DIR}/test-results"}/
Expand Down
13 changes: 9 additions & 4 deletions ci/cudf_pandas_scripts/run_tests.sh
Original file line number Diff line number Diff line change
Expand Up @@ -36,10 +36,15 @@ if [ "$no_cudf" = true ]; then
echo "Skipping cudf install"
else
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-pylibcudf-dep
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-cudf-dep
python -m pip install $(ls ./local-pylibcudf-dep/pylibcudf*.whl)
python -m pip install $(ls ./local-cudf-dep/cudf*.whl)[test,cudf-pandas-tests]

# Download the cudf and pylibcudf built in the previous step
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
"$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test,cudf-pandas-tests]" \
"$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)"
fi

python -m pytest -p cudf.pandas \
Expand Down
11 changes: 6 additions & 5 deletions ci/test_wheel_cudf.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,15 +3,16 @@

set -eou pipefail

# Download the pylibcudf built in the previous step
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-pylibcudf-dep

# Download the cudf and pylibcudf built in the previous step
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

# Install both pylibcudf and cudf
# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
"$(echo ./local-pylibcudf-dep/pylibcudf*.whl)[test]" \
"$(echo ./dist/cudf*.whl)[test]"
"$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" \
"$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]"

RESULTS_DIR=${RAPIDS_TESTS_DIR:-"$(mktemp -d)"}
RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${RESULTS_DIR}/test-results"}/
Expand Down
15 changes: 10 additions & 5 deletions ci/test_wheel_cudf_polars.sh
Original file line number Diff line number Diff line change
Expand Up @@ -20,12 +20,17 @@ fi
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
RAPIDS_PY_WHEEL_NAME="cudf_polars_${RAPIDS_PY_CUDA_SUFFIX}" RAPIDS_PY_WHEEL_PURE="1" rapids-download-wheels-from-s3 ./dist

# Download the cudf built in the previous step
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-pylibcudf-dep
python -m pip install ./local-pylibcudf-dep/pylibcudf*.whl
# Download the cudf and pylibcudf built in the previous step
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

rapids-logger "Install cudf_polars"
python -m pip install $(echo ./dist/cudf_polars*.whl)[test]
rapids-logger "Installing cudf_polars and its dependencies"

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install \
"$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)" \
"$(echo ./dist/cudf_polars_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" \
"$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)"

rapids-logger "Run cudf_polars tests"

Expand Down
14 changes: 7 additions & 7 deletions ci/test_wheel_dask_cudf.sh
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,15 @@ set -eou pipefail
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
RAPIDS_PY_WHEEL_NAME="dask_cudf_${RAPIDS_PY_CUDA_SUFFIX}" RAPIDS_PY_WHEEL_PURE="1" rapids-download-wheels-from-s3 ./dist

# Download the cudf built in the previous step
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-pylibcudf-dep
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./local-cudf-dep
python -m pip install \
"$(echo ./local-pylibcudf-dep/pylibcudf*.whl)" \
"$(echo ./local-cudf-dep/cudf*.whl)"
# Download the cudf and pylibcudf built in the previous step
RAPIDS_PY_WHEEL_NAME="cudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
RAPIDS_PY_WHEEL_NAME="pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install $(echo ./dist/dask_cudf*.whl)[test]
python -m pip install \
"$(echo ./dist/cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)" \
"$(echo ./dist/dask_cudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" \
"$(echo ./dist/pylibcudf_${RAPIDS_PY_CUDA_SUFFIX}*.whl)"

RESULTS_DIR=${RAPIDS_TESTS_DIR:-"$(mktemp -d)"}
RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${RESULTS_DIR}/test-results"}/
Expand Down
Loading