Gpu task is not working #1666

Open
ffais opened this issue Jan 14, 2025 · 1 comment · May be fixed by #1675
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug.

Comments

ffais commented Jan 14, 2025

Building images with NVIDIA drivers using the gpu role and the variables suggested in the README does not work.

The cause, as the log below shows, is that the nvidia_ceph variable is not defined.

Environment

  • Make target: make build-qemu-ubuntu-2204
  • Run using container image? (N):
  • Environment vars:

  • Vars file:
{
  "ansible_user_vars": "gpu_vendor=nvidia nvidia_s3_url=https://minio-api nvidia_bucket=nvidia-drivers nvidia_bucket_access=image-builder nvidia_bucket_secret=image-builder nvidia_installer_location=NVIDIA-Linux-x86_64-550.127.08.run",
  "node_custom_roles_pre": "gpu"
}
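
To confirm which variables actually reach the role, the vars file can be parsed and checked for a given name. This is only a minimal sketch: it assumes `ansible_user_vars` is a plain space-separated list of `key=value` pairs, as in the file above, and the helper name `parse_user_vars` is made up for illustration.

```python
import json
import shlex

# Contents of the vars file from this report (ubuntu-2204-nvidia.json).
VARS_FILE = """
{
  "ansible_user_vars": "gpu_vendor=nvidia nvidia_s3_url=https://minio-api nvidia_bucket=nvidia-drivers nvidia_bucket_access=image-builder nvidia_bucket_secret=image-builder nvidia_installer_location=NVIDIA-Linux-x86_64-550.127.08.run",
  "node_custom_roles_pre": "gpu"
}
"""


def parse_user_vars(raw: str) -> dict[str, str]:
    """Split a space-separated key=value string into a dict (hypothetical helper)."""
    result = {}
    for token in shlex.split(raw):
        key, _, value = token.partition("=")
        result[key] = value
    return result


user_vars = parse_user_vars(json.loads(VARS_FILE)["ansible_user_vars"])
print(sorted(user_vars))
print("nvidia_ceph defined:", "nvidia_ceph" in user_vars)
```

With the file above, `nvidia_ceph` is not among the variables passed in, which matches the error in the log.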

What steps did you take and what happened?

Ran `PACKER_VAR_FILES="ubuntu-2204-nvidia.json" make build-qemu-ubuntu-2204` with the vars file above. The build fails at the gpu role's "Download NVIDIA driver installer file" task (see the log below).

What did you expect to happen?

An Ubuntu 22.04 image containing the NVIDIA drivers.

Relevant log output

    qemu: TASK [gpu : Download NVIDIA driver installer file] *****************************
    qemu: fatal: [default]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'nvidia_ceph' is undefined. 'nvidia_ceph' is undefined\n\nThe error appears to be in '/home/ffais/work/image-builder/images/capi/ansible/roles/gpu/tasks/nvidia.yml': line 83, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Download NVIDIA driver installer file\n  ^ here\n"}
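
For longer build logs, the undefined variable names can be pulled out of the Ansible error text mechanically. A rough sketch follows; the pattern `'<name>' is undefined` is taken from this one log line, so other Ansible versions may word the message differently, and `undefined_variables` is a hypothetical helper name.

```python
import re

# Error text copied from the log above (only the first sentences).
ERROR_MSG = (
    "The task includes an option with an undefined variable. "
    "The error was: 'nvidia_ceph' is undefined. 'nvidia_ceph' is undefined"
)


def undefined_variables(message: str) -> list[str]:
    """Return the distinct variable names reported as undefined, in order."""
    names = re.findall(r"'([^']+)' is undefined", message)
    # dict.fromkeys drops repeats while keeping first-seen order.
    return list(dict.fromkeys(names))


print(undefined_variables(ERROR_MSG))
```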

Anything else you would like to add?


/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jan 14, 2025
@mboersma mboersma added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jan 14, 2025
@ffais ffais linked a pull request Jan 20, 2025 that will close this issue
ffais commented Jan 20, 2025

/assign
