Migrate to NVKS for amd64 CI runners #6280

Open
wants to merge 3 commits into
base: branch-25.04

bdice
Contributor

@bdice bdice commented Jan 30, 2025

This migrates amd64 CI jobs (PRs and nightlies) to use L4 GPUs from the NVKS cluster.

xref: https://github.com/rapidsai/build-infra/issues/184

@bdice bdice requested a review from a team as a code owner January 30, 2025 18:28
@bdice bdice requested a review from msarahan January 30, 2025 18:28
@bdice bdice added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Jan 30, 2025
@jakirkham
Member

Is there something still needed for CUDA 12.8 here?

Seeing the following error on CI:

Traceback (most recent call last):
  File "/opt/conda/bin/rapids-dependency-file-generator", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/rapids_dependency_file_generator/_cli.py", line 125, in main
    make_dependency_files(
  File "/opt/conda/lib/python3.12/site-packages/rapids_dependency_file_generator/_rapids_dependency_file_generator.py", line 474, in make_dependency_files
    raise ValueError(f"No matching matrix found in '{include}' for: {matrix_combo}")
ValueError: No matching matrix found in 'cuda_version' for: {'cuda': '12.8', 'arch': 'x86_64'}

Maybe we need to resolve the forward merger: #6272
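
For readers unfamiliar with this error, here is a minimal sketch of how a matrix lookup of this kind can fail. It is a simplified illustration, not the actual `rapids_dependency_file_generator` implementation: the function `pick_packages`, the `cuda_version_entries` list, and the package strings are all hypothetical, and the sketch assumes exact-match selectors with an optional empty-matrix fallback.

```python
# Simplified, hypothetical model of matching a build matrix combination
# against the entries of a dependency list. Not the real library code.

def pick_packages(include, entries, matrix_combo):
    """Return the packages of the first entry whose selector matches matrix_combo."""
    fallback = None
    for entry in entries:
        selector = entry.get("matrix")
        if not selector:
            # An entry with an empty/missing matrix acts as a catch-all fallback.
            fallback = entry
            continue
        # An entry matches when every key/value in its selector is in the combo.
        if all(matrix_combo.get(key) == value for key, value in selector.items()):
            return entry["packages"]
    if fallback is not None:
        return fallback["packages"]
    raise ValueError(f"No matching matrix found in '{include}' for: {matrix_combo}")


# Hypothetical entries: no selector for CUDA 12.8 and no fallback entry.
cuda_version_entries = [
    {"matrix": {"cuda": "11.8"}, "packages": ["cuda-version=11.8"]},
    {"matrix": {"cuda": "12.5"}, "packages": ["cuda-version=12.5"]},
]

combo = {"cuda": "12.8", "arch": "x86_64"}
try:
    pick_packages("cuda_version", cuda_version_entries, combo)
except ValueError as err:
    print(err)

# Adding a matching entry (as an updated dependency file would) resolves the lookup.
cuda_version_entries.append(
    {"matrix": {"cuda": "12.8"}, "packages": ["cuda-version=12.8"]}
)
print(pick_packages("cuda_version", cuda_version_entries, combo))
```

Under these assumptions, the lookup only succeeds once the branch carries an entry covering CUDA 12.8, which is consistent with the suggestion that the forward merge may be what is missing.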

@dantegd
Member

dantegd commented Jan 31, 2025

@bdice the current failures happen because the jobs are picking up a cudf nightly build that doesn't include the fixes in 25.02a358

@jameslamb
Member

The most recent cuDF branch build finished a couple of minutes ago (https://github.com/rapidsai/cudf/actions/runs/13078947512). I've merged branch-25.04 in here to restart CI and pull in those new packages.
