Install nvidia drivers and container toolkit #1
base: main
to simplify sandbox credential usage
to reuse logic from bin/patch/ubuntu24-x64
```
# custom stuff
mkdir -p $custom_dir
cp -r patches/ubuntu/files $custom_dir/
```
to reuse logic from bin/patch/ubuntu24-x64
the same type that is running the RunsOn runner now
to work around availability_zone issues
Example of a successfully dispatched workflow building and testing a GPU enabled AMI for RunsOn:
@@ -7,14 +7,17 @@ on:
type: string
required: true
description: 'Distribution(s) to build'
default: '["ubuntu22-full-x64", "ubuntu22-full-arm64", "ubuntu24-full-x64", "ubuntu24-full-arm64"]'
# default: '["ubuntu22-full-x64", "ubuntu22-full-arm64", "ubuntu24-full-x64", "ubuntu24-full-arm64"]'
default: '["ubuntu24-full-x64"]'
# schedule:
# - cron: '0 8 */15 * *'
@biofotis or @jamescnewman, if you're good with this patch, then I'd recommend re-enabling the schedule event before merging into the default branch of our fork, so that our GPU AMIs are kept up to date until we can swap back to upstream RunsOn AMIs with GPU support. If you make the commit that changes this line, then you'll also be the one notified about scheduled workflow runs, which makes it easier to keep tabs on them.
Notifications for scheduled workflows are sent to the user who last modified the cron syntax in the workflow file. For more information, see Notifications for workflow runs.
We should discuss offline in Slack before finalising and update this PR with the result.
This patches the existing packer template to build GPU enabled AWS AMIs for RunsOn runners, specifically just for Ubuntu 24.04, the latest LTS release. This is done simply by adding two additional scripts for installing the NVIDIA drivers and the NVIDIA container toolkit, both necessary for GPU acceleration of containerized jobs in GitHub Actions via RunsOn.
Additionally, this also customizes the CI/CD infrastructure for our own use case and AWS regions of interest. This simplifies the process of maintaining our custom AMIs while we wait for such options to be upstreamed into RunsOn. The test jobs are also altered to include sanity checks that verify the installation of NVIDIA software on a GPU enabled instance.
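For reviewers who want a feel for what such sanity checks could look like, here is a minimal sketch. It is not the script from this PR: the function names `require_tool` and `gpu_sanity_check` are hypothetical, and the `ubuntu` image in the container check is an assumption. It only relies on `nvidia-smi`, `nvidia-ctk --version`, and `docker run --gpus all`.

```shell
#!/usr/bin/env bash
# Hypothetical sanity check sketch; names are illustrative and not taken from this PR.

# Return 0 if the named executable is on PATH, 1 otherwise.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing required tool: $1" >&2
  return 1
}

# Run the GPU checks in order; intended to be called on a GPU enabled instance.
gpu_sanity_check() {
  require_tool nvidia-smi || return 1
  require_tool nvidia-ctk || return 1
  require_tool docker || return 1

  # Driver check: nvidia-smi exits non-zero if it cannot talk to the driver.
  nvidia-smi || return 1

  # Toolkit check: print the installed NVIDIA Container Toolkit CLI version.
  nvidia-ctk --version || return 1

  # Container check: the GPU should be visible from inside a container.
  # The "ubuntu" image is an assumption; any small image would do.
  docker run --rm --gpus all ubuntu nvidia-smi || return 1
}
```

Sourcing the file only defines the functions; a test job would then call `gpu_sanity_check` and fail the step on a non-zero exit status.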
TODO: