use rapids infra to run testing #1216

Merged
10 commits merged on Oct 15, 2023
43 changes: 28 additions & 15 deletions .github/workflows/gpu.yml
@@ -1,29 +1,34 @@
name: gpu-ci
name: GPU CI

on:
workflow_dispatch:
push:
branches: [main]
branches:
- main
- "pull-request/[0-9]+"
tags:
- "v[0-9]+.[0-9]+.[0-9]+"
pull_request:
branches: [main]
types: [opened, synchronize, reopened]

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
gpu-ci:
runs-on: 1GPU
runs-on: linux-amd64-gpu-p100-latest-1

Check failure on line 14 in .github/workflows/gpu.yml

GitHub Actions / actionlint

label "linux-amd64-gpu-p100-latest-1" is unknown. available labels are "windows-latest", "windows-2022", "windows-2019", "windows-2016", "ubuntu-latest", "ubuntu-22.04", "ubuntu-20.04", "ubuntu-18.04", "macos-latest", "macos-12", "macos-12.0", "macos-11", "macos-11.0", "macos-10.15", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows", "1GPU", "2GPU". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

container:
image: nvcr.io/nvstaging/merlin/merlin-ci-runner:latest
env:
NVIDIA_VISIBLE_DEVICES: ${{ env.NVIDIA_VISIBLE_DEVICES }}
options: --shm-size=1G
credentials:
username: $oauthtoken
password: ${{ secrets.NGC_TOKEN }}

steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Run tests
run: |
nvidia-smi
pip install tox
ref_type=${{ github.ref_type }}
branch=main
if [[ $ref_type == "tag"* ]]
@@ -34,17 +39,25 @@
if [[ "${{ github.ref }}" != 'refs/heads/main' ]]; then
extra_pytest_markers="and changed"
fi
cd ${{ github.workspace }}; PYTEST_MARKERS="unit and not (examples or integration or notebook) and (singlegpu or not multigpu) $extra_pytest_markers" MERLIN_BRANCH=$branch COMPARE_BRANCH=${{ github.base_ref }} tox -e gpu

tests-examples:
runs-on: 1GPU
PYTEST_MARKERS="unit and not (examples or integration or notebook) and (singlegpu or not multigpu) $extra_pytest_markers" MERLIN_BRANCH=$branch COMPARE_BRANCH=${{ github.base_ref }} tox -e gpu

gpu-ci-examples:
runs-on: linux-amd64-gpu-p100-latest-1

Check failure on line 45 in .github/workflows/gpu.yml

GitHub Actions / actionlint

label "linux-amd64-gpu-p100-latest-1" is unknown. available labels are "windows-latest", "windows-2022", "windows-2019", "windows-2016", "ubuntu-latest", "ubuntu-22.04", "ubuntu-20.04", "ubuntu-18.04", "macos-latest", "macos-12", "macos-12.0", "macos-11", "macos-11.0", "macos-10.15", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows", "1GPU", "2GPU". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

container:
image: nvcr.io/nvstaging/merlin/merlin-ci-runner:latest
env:
NVIDIA_VISIBLE_DEVICES: ${{ env.NVIDIA_VISIBLE_DEVICES }}
options: --shm-size=1G
credentials:
username: $oauthtoken
password: ${{ secrets.NGC_TOKEN }}
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Run tests
run: |
pip install tox
ref_type=${{ github.ref_type }}
branch=main
if [[ $ref_type == "tag"* ]]
@@ -55,4 +68,4 @@
if [[ "${{ github.ref }}" != 'refs/heads/main' ]]; then
extra_pytest_markers="and changed"
fi
cd ${{ github.workspace }}; PYTEST_MARKERS="(examples or notebook) $extra_pytest_markers" MERLIN_BRANCH=$branch COMPARE_BRANCH=${{ github.base_ref }} tox -e gpu
PYTEST_MARKERS="(examples or notebook) $extra_pytest_markers" MERLIN_BRANCH=$branch COMPARE_BRANCH=${{ github.base_ref }} tox -e gpu
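
The actionlint failures reported in this diff occur because `linux-amd64-gpu-p100-latest-1` is a custom self-hosted runner label that actionlint does not recognize by default. As the error message itself suggests, the label can be declared in an actionlint config file. A minimal sketch, assuming the config lives at `.github/actionlint.yaml` (the file is an assumption for illustration; it is not part of this PR's diff):

```yaml
# .github/actionlint.yaml (assumed location; not shown in this PR)
self-hosted-runner:
  # Custom labels that actionlint should accept as valid `runs-on` values
  labels:
    - linux-amd64-gpu-p100-latest-1
```

With a declaration like this in place, actionlint should treat the new `runs-on` value as known for both the `gpu-ci` and `gpu-ci-examples` jobs. The workflow itself does not need to change.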