[RELEASE] raft v24.12 #2505

raydouglass · 2024-11-21T20:50:21Z

❄️ Code freeze for `branch-24.12` and v24.12 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-24.12 until release (merging of this PR).

What is the purpose of this PR?

- Update documentation
- Allow testing for the new release
- Enable a means to merge branch-24.12 into main for the release

Forward-merge branch-24.10 into branch-24.12

Merge branch-24.10 into branch-24.12

Contributes to rapidsai/build-planning#94 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #2466

This PR updates all the RMM imports to use pylibrmm/librmm now that `rmm._lib` is deprecated. It should be merged after [rmm/1676](rapidsai/rmm#1676). Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Ben Frederickson (https://github.com/benfred) URL: #2451

Forward-merge branch-24.10 into branch-24.12

Contributes to rapidsai/build-planning#106 Proposes specifying the RAPIDS version in `conda install` calls in CI that install CI artifacts, to reduce the risk of CI jobs picking up artifacts from other releases. Also proposes combining successive `pip install` calls. `pip install A B` is safer than `pip install A; pip install B` because `pip` doesn't take the current set of installed packages into consideration when it installs new packages. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #2467

Follow-up PR to address feedback: #2474 (comment) Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #2475

The 12.6.1 CUDA compiler has issues with enable_if inside the template arguments of some kernels. We can simplify kernel logic and remove the usage of enable_if. Authors: - Robert Maynard (https://github.com/robertmaynard) - Paul Taylor (https://github.com/trxcllnt) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #2469

This PR replaces the `VAULT_HOST` variable with `AWS_ROLE_ARN`, which is required to use the new token service to get AWS credentials. Authors: - Jordan Jacobelli (https://github.com/jjacobelli) Approvers: - Paul Taylor (https://github.com/trxcllnt) - Bradley Dice (https://github.com/bdice) URL: #2472

Contributes to rapidsai/build-planning#111

Proposes some small packaging/CI changes, matching similar changes being made across RAPIDS.

* printing `sccache` stats to CI logs
* reducing `pip`'s verbosity in wheel building scripts
* updating to the latest `rapids-dependency-file-generator` (v1.16.0)
* always explicitly specifying `cpp` / `python` in calls to `rapids-upload-wheels-to-s3`

## Notes for Reviewers

This originally also ran wheel builds with `--no-build-isolation`, but I reverted that based on rapidsai/build-planning#108 (comment).

Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #2470

`thrust::host_vector` initializes its elements at creation and requires the element type to be default-constructible. `raft::pinned_mdarray` inherits this requirement, which makes the mdarray unusable for non-default-constructible objects, like `cuda::atomic<>` (and many user-defined types). This is inconsistent with all other mdarray types in raft, which are based on `rmm::device_uvector` and are not initialized at construction time. The PR changes the underlying container to a plain pointer managed with `cudaMallocHost`/`cudaFreeHost`. **Breaking change**: code that relies on `pinned_mdarray` initializing itself will break (but mdarrays should not initialize at construction in raft anyway). The affected classes have different private members now, so the ABI changes as well. Authors: - Artem M. Chirkin (https://github.com/achirkin) Approvers: - William Hicks (https://github.com/wphicks) URL: #2478

Here are the results of looking at the `cudaPointerGetAttributes` output for different allocation types on Grace + Hopper. Allocations made with `malloc` are still usable on the GPU.

```
cudaPointerGetAttributes attributes
malloc ptr
is_dev_ptr -> 1
is_host_ptr -> 1
memory loc -> unregistered

cudaPointerGetAttributes attributes
cudaMalloc ptr
is_dev_ptr -> 1
is_host_ptr -> 0
memory loc -> device

cudaPointerGetAttributes attributes
cudaMallocManaged cudaMemAttachGlobal ptr
is_dev_ptr -> 1
is_host_ptr -> 1
memory loc -> managed
```

Authors: - Robert Maynard (https://github.com/robertmaynard) Approvers: - Micka (https://github.com/lowener) URL: #2480

This project is incompatible with newer versions of `cuda-python`. This puts ceilings of `<=11.8.3` (CUDA 11) and `<=12.6.0` (CUDA 12) on that library. Those ceilings should be removed and replaced with `!=` constraints once new releases of `cuda-python` are up that this project is compatible with. See rapidsai/build-planning#116 for more information. Authors: - Bradley Dice (https://github.com/bdice) - James Lamb (https://github.com/jameslamb) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #2486
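To illustrate the difference between a ceiling and a `!=` exclusion, here is a minimal sketch using plain tuple comparison. It is not how `pip` or `conda` evaluate constraints, and the candidate list and the excluded version `12.6.1` are hypothetical placeholders chosen only for the demonstration.

```python
def parse(version):
    """Turn a string like '12.6.0' into (12, 6, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def allowed_by_ceiling(version, ceiling):
    """A `<=ceiling` constraint: rejects every release newer than the ceiling."""
    return parse(version) <= parse(ceiling)


def allowed_by_exclusion(version, excluded):
    """A `!=` constraint: rejects only the listed releases."""
    return version not in excluded


# Hypothetical candidate releases, for illustration only.
candidates = ["12.5.0", "12.6.0", "12.6.1", "12.6.2"]

ceiling_ok = [v for v in candidates if allowed_by_ceiling(v, "12.6.0")]
exclusion_ok = [v for v in candidates if allowed_by_exclusion(v, {"12.6.1"})]

print("ceiling  :", ceiling_ok)
print("exclusion:", exclusion_ok)
```

The ceiling also blocks any later release that might be compatible again, which is why the description proposes switching to `!=` constraints once such releases exist.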

…on Cython dependency (#2490)

Contributes to rapidsai/build-planning#110

Proposes adding 2 types of validation on wheels in CI, to ensure we continue to produce wheels that are suitable for PyPI.

* checks on wheel size (compressed),
  - *to be sure they're under PyPI limits*
  - *and to prompt discussion on PRs that significantly increase wheel sizes*
* checks on README formatting
  - *to ensure they'll render properly as the PyPI project homepages*
  - *e.g. like how https://github.com/scikit-learn/scikit-learn/blob/main/README.rst becomes https://pypi.org/project/scikit-learn/*

Also puts a ceiling on Cython to its latest stable release (`<=3.0.11`), to fix #2490 (comment). Work to relax that is tracked in (#2491).

Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #2490
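A compressed-size check of the kind described above could look roughly like the following standard-library sketch. The tooling this PR actually uses is not shown in the description, so the helper names, the demo archives, and the `LIMIT_BYTES` value are all assumptions made for illustration.

```python
import os
import tempfile
import zipfile


def wheel_size_ok(wheel_path, max_bytes):
    """Return (ok, size): whether the compressed wheel file is within max_bytes."""
    size = os.path.getsize(wheel_path)
    return size <= max_bytes, size


def make_demo_wheel(directory, name, payload_bytes):
    """Write a zip archive with a wheel-like name (not a valid wheel, demo only)."""
    path = os.path.join(directory, f"{name}-0.0.0-py3-none-any.whl")
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(f"{name}/__init__.py", "x = 1\n")
        # Random bytes barely compress, so the file size tracks payload_bytes.
        zf.writestr(f"{name}/data.bin", os.urandom(payload_bytes))
    return path


LIMIT_BYTES = 10 * 1024  # hypothetical limit, far smaller than any real one

with tempfile.TemporaryDirectory() as tmp:
    small = make_demo_wheel(tmp, "small_pkg", 1_000)
    large = make_demo_wheel(tmp, "large_pkg", 50_000)
    small_ok, small_size = wheel_size_ok(small, LIMIT_BYTES)
    large_ok, large_size = wheel_size_ok(large, LIMIT_BYTES)

print(f"small: {small_size} bytes, ok={small_ok}")
print(f"large: {large_size} bytes, ok={large_ok}")
```

Checking the file on disk measures the compressed size, which is the quantity the description says matters for PyPI limits.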

A simple PR for pinning the FAISS version to fetch the compatible tag for raft-ann-bench Authors: - Tarang Jain (https://github.com/tarang-jain) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: #2496

This PR removes raft-ann-bench from the conda packages, build system, and documentation. This removal was previously announced for the 24.12 release in #2448. Authors: - Corey J. Nolet (https://github.com/cjnolet) - Bradley Dice (https://github.com/bdice) Approvers: - Ben Frederickson (https://github.com/benfred) - Bradley Dice (https://github.com/bdice) URL: #2497

We are keeping random ball cover headers in RAFT for 24.12, and random ball cover depends on distances and brute-force. Because of this, we're going to leave all of the VSS headers in RAFT for the time being, and will remove them all in a future PR once RBC is formally migrated to cuVS. The tests, benchmarks, and instantiations for all of these APIs will be removed, though, so while the actual headers can still be used, they are no longer being tested and could fail without warning. I've also included a note in the README advising users to use these headers at their own risk. Authors: - Corey J. Nolet (https://github.com/cjnolet) - Bradley Dice (https://github.com/bdice) Approvers: - Ben Frederickson (https://github.com/benfred) - Bradley Dice (https://github.com/bdice) URL: #2498

…ng ignored in headers (#2501) Authors: - Corey J. Nolet (https://github.com/cjnolet) - Bradley Dice (https://github.com/bdice) Approvers: - Divye Gala (https://github.com/divyegala) URL: #2501

Follow-up to #2498 (comment). This removes an extraneous file in `cpp/template` that resulted from an incorrect merge conflict resolution and a leftover reference to the template project in `update-version.sh`. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #2500

@aamijar

I unfortunately don't have permission to push to @aamijar's branch for the previous Lanczos solver PR (#2416), so I kept his commits and continued the work here.

## Lanczos Solver for Sparse Eigen Decomposition

We propose a new Lanczos solver in raft that fixes the issues present in the previous solver, `raft::sparse::solver::detail::computeSmallestEigenvectors`. Specifically, we address the following issues:

1. Numerical stability for both float32 and float64 data types
2. Efficiency and speed of convergence

This new implementation is adapted from the CuPy library's `cupyx.scipy.sparse.linalg.eigsh`, which uses the thick-restart and full reorthogonalization methods. Additionally, this PR exposes a Python API for the raft Lanczos solver with an interface similar to `scipy.sparse.linalg.eigsh` and `cupyx.scipy.sparse.linalg.eigsh`.

```py3
from pylibraft.solver import eigsh
```

Authors: - Micka (https://github.com/lowener) - Anupam (https://github.com/aamijar) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Corey J. Nolet (https://github.com/cjnolet) URL: #2481
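For readers unfamiliar with full reorthogonalization, the following pure-Python sketch runs a plain Lanczos iteration on a small dense symmetric matrix. It is not the thick-restart algorithm from this PR and does not use the `pylibraft` API; the function names and the 3x3 test matrix are made up for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def lanczos(A, v0, k):
    """Run up to k Lanczos steps on symmetric A.

    Returns (alphas, betas, Q): the diagonal and off-diagonal of the
    tridiagonal matrix T, plus the orthonormal basis vectors.
    """
    norm = math.sqrt(dot(v0, v0))
    Q = [[x / norm for x in v0]]
    alphas, betas = [], []
    for j in range(k):
        w = matvec(A, Q[j])
        alphas.append(dot(w, Q[j]))
        # Full reorthogonalization: remove the component along every
        # basis vector built so far, not just the two most recent ones.
        for q in Q:
            c = dot(w, q)
            w = [wx - c * qx for wx, qx in zip(w, q)]
        beta = math.sqrt(dot(w, w))
        if j == k - 1 or beta < 1e-12:
            break  # finished, or the Krylov subspace is exhausted
        betas.append(beta)
        Q.append([x / beta for x in w])
    return alphas, betas, Q


# Small symmetric example matrix (illustrative only).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

alphas, betas, Q = lanczos(A, [1.0, 1.0, 1.0], 3)
print("diagonal of T    :", alphas)
print("off-diagonal of T:", betas)
```

When `k` equals the matrix dimension and no breakdown occurs, `T` is orthogonally similar to `A`, so the sum of `alphas` matches the trace of `A`; that gives a quick sanity check on the basis.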

This upgrade is important because cutlass 3.4 fixed kernel visibility issues that would otherwise cause raft kernels using cutlass to be exposed as global kernel symbols. Authors: - Vyas Ramasubramani (https://github.com/vyasr) - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Robert Maynard (https://github.com/robertmaynard) - Corey J. Nolet (https://github.com/cjnolet) - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #2503

raydouglass and others added 30 commits September 19, 2024 12:04

DOC v24.12 Updates [skip ci]

12537c5

Merge pull request #2442 from rapidsai/branch-24.10

3c24cb9

Forward-merge branch-24.10 into branch-24.12

Merge pull request #2444 from rapidsai/branch-24.10

4354fb9

Forward-merge branch-24.10 into branch-24.12

Merge pull request #2445 from rapidsai/branch-24.10

9ffd49a

Forward-merge branch-24.10 into branch-24.12

Merge pull request #2449 from rapidsai/branch-24.10

03f8025

Forward-merge branch-24.10 into branch-24.12

Merge pull request #2452 from rapidsai/branch-24.10

e6174f2

Forward-merge branch-24.10 into branch-24.12

Merge pull request #2454 from rapidsai/branch-24.10

4c4d9bc

Forward-merge branch-24.10 into branch-24.12

Merge pull request #2455 from rapidsai/branch-24.10

c8957bc

Forward-merge branch-24.10 into branch-24.12

Merge pull request #2456 from rapidsai/branch-24.10

c0379bb

Forward-merge branch-24.10 into branch-24.12

Merge branch-24.10

3c85ad9

Merge pull request #2461 from jameslamb/branch-24.12-merge-24.10

90e62e0

Merge branch-24.10 into branch-24.12

Prune workflows based on changed files (#2466)

94cc2c2

Merge pull request #2468 from rapidsai/branch-24.10

5e18c85

Forward-merge branch-24.10 into branch-24.12

Use Python for computation. (#2474)

714e07b

Use environment variables in cache hit rate computation. (#2475)

8dc8245

Pin FAISS Version for raft-ann-bench (#2496)

8a9bf5c

Removing some left over places where implicit instantiations were bei…

7ab2b3d

…ng ignored in headers (#2501) Authors: - Corey J. Nolet (https://github.com/cjnolet) - Bradley Dice (https://github.com/bdice) Approvers: - Divye Gala (https://github.com/divyegala) URL: #2501

raydouglass requested review from a team as code owners November 21, 2024 20:50

raydouglass requested review from msarahan and removed request for a team November 21, 2024 20:50

github-actions bot added cpp CMake python ci labels Nov 21, 2024
