cuda toolkit not installed for user #796

Closed
81reap opened this issue Feb 23, 2024 · 1 comment

Comments

81reap commented Feb 23, 2024

Describe the bug

Reopening this issue because it is still unresolved, though I have found a workaround.

On a clean install of the bazzite-nvidia branch, CUDA does not work. To reproduce:

  1. Perform a clean install of bazzite-nvidia.
  2. Login as the user.
  3. Check for cuda by running nvcc --version. It will fail to find the command.
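The check in step 3 can be scripted. This is a minimal sketch, assuming a POSIX shell; `check_tool` is a hypothetical helper name and not part of Bazzite or the CUDA toolkit. It only reports whether a command resolves through PATH, which is the symptom described above.

```shell
# check_tool is a hypothetical helper: it reports whether a command
# can be resolved through PATH, and returns 1 when it cannot.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# On an affected install, nvidia-smi should resolve while nvcc should not.
check_tool nvidia-smi || true
check_tool nvcc || true
```

A "not found" result for nvcc only shows that it is not on PATH; the toolkit may still be installed in a directory PATH does not cover.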

What did you expect to happen?

Based on the output of nvidia-smi on a clean install of the bazzite-nvidia branch, I would expect CUDA to be installed on the machine and nvcc --version to work. This assumption is further supported by the fact that bazzite-nvidia comes preinstalled with the CUDA libs.

reap@fedora:~$ nvidia-smi
Thu Feb 22 18:58:12 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX 4000 SFF Ada ...    Off | 00000000:01:00.0 Off |                  Off |
| 30%   33C    P8               5W /  70W |      2MiB / 20475MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

reap@fedora:~$ rpm -qa | grep nvidia
nvidia-gpu-firmware-20240115-2.fc39.noarch
ublue-os-nvidia-addons-0.10-1.fc39.noarch
xorg-x11-drv-nvidia-cuda-libs-545.29.06-2.fc39.x86_64
nvidia-modprobe-545.29.06-1.fc39.x86_64
nvidia-persistenced-545.29.06-1.fc39.x86_64
nvidia-container-toolkit-base-1.14.5-1.x86_64
libnvidia-container1-1.14.5-1.x86_64
libnvidia-container-tools-1.14.5-1.x86_64
nvidia-container-toolkit-1.14.5-1.x86_64
xorg-x11-drv-nvidia-kmodsrc-545.29.06-2.fc39.x86_64
libva-nvidia-driver-0.0.11-1.fc39.x86_64
xorg-x11-drv-nvidia-libs-545.29.06-2.fc39.i686
xorg-x11-drv-nvidia-libs-545.29.06-2.fc39.x86_64
nvidia-settings-545.29.06-1.fc39.x86_64
xorg-x11-drv-nvidia-power-545.29.06-2.fc39.x86_64
kmod-nvidia-6.7.5-201.fsync.fc39.x86_64-545.29.06-3.fc39.x86_64
xorg-x11-drv-nvidia-545.29.06-2.fc39.x86_64
xorg-x11-drv-nvidia-cuda-libs-545.29.06-2.fc39.i686
xorg-x11-drv-nvidia-cuda-545.29.06-2.fc39.x86_64
xorg-x11-drv-nvidia-devel-545.29.06-2.fc39.x86_64

reap@fedora:~$ nvcc --version
# only works after the workaround

Output of rpm-ostree status

reap@fedora:~$ rpm-ostree status
State: idle
Deployments:
● ostree-image-signed:docker://ghcr.io/ublue-os/bazzite-nvidia:latest
                   Digest: sha256:0c779386100258771e5947e913348d072935e2683f8301ca5c256d4b8f44bbb0
                  Version: 39.20240116.0 (2024-02-20T17:03:54Z)
                Initramfs: '"-I /etc/crypttab /usr/lib/modprobe.d/nvidia.conf"' 

  ostree-unverified-image:docker://ghcr.io/ublue-os/bazzite-nvidia:latest
                   Digest: sha256:0c779386100258771e5947e913348d072935e2683f8301ca5c256d4b8f44bbb0
                  Version: 39.20240116.0 (2024-02-20T17:03:54Z)
                Initramfs: '"-I /etc/crypttab /usr/lib/modprobe.d/nvidia.conf"'

Hardware

B550I Aorus Pro AX
AMD Ryzen 7 5700G
Nvidia RTX 4000 SFF Ada Gen
2x32GB @ 3200 MHz
2TB NVME Drive

Setup Notes

  • Secure Boot is disabled in the BIOS.
  • The OS and KDE run on the AMD GPU. Steam games launch successfully on the Nvidia GPU.

Extra information or context

The Workaround
Note: The workaround does not fix the issue for podman containers. Any containers that require CUDA will have to be run in userspace.

$ nvidia-smi
# this shows the correct output and says that cuda 12.3 is installed
$ nvcc --version
# this should fail to find nvcc
$ ls /usr/local
# this output does not contain cuda which confirms that the cuda toolkit is not installed

$ wget https://developer.download.nvidia.com/compute/cuda/12.3.2/local_installers/cuda_12.3.2_545.23.08_linux.run
$ sudo sh cuda_12.3.2_545.23.08_linux.run
# this will require you to accept the licence first. You should install only the cuda toolkit, not the bundled driver, as the system already has nvidia drivers.
$ ls /usr/local
# now we have the cuda toolkit, but nvcc will still fail as it is not on your path

# add these two export lines to your ~/.bashrc so that they are loaded in every new shell
$ export PATH=/usr/local/cuda-12.3/bin${PATH:+:${PATH}}
$ export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
$ nvcc --version 
# nvcc now works
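
The two export lines above can be made safe to re-source. This is a sketch only, assuming the toolkit landed in /usr/local/cuda-12.3 as in the workaround; `prepend_path` is a hypothetical helper and `CUDA_HOME` is a variable name chosen here for illustration. It skips the change when the directory is missing and avoids duplicate entries when ~/.bashrc is loaded more than once.

```shell
# prepend_path is a hypothetical helper: it prints LIST with DIR placed in
# front, unless DIR is already one of the colon-separated entries.
prepend_path() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *) printf '%s\n' "$dir${list:+:$list}" ;;
    esac
}

# Assumed install prefix, taken from the workaround above.
CUDA_HOME=/usr/local/cuda-12.3

# Only touch the environment when the toolkit directory actually exists.
if [ -d "$CUDA_HOME/bin" ]; then
    PATH=$(prepend_path "$CUDA_HOME/bin" "$PATH")
    LD_LIBRARY_PATH=$(prepend_path "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")
    export PATH LD_LIBRARY_PATH
fi
```

Guarding on the directory means the same ~/.bashrc works unchanged on a machine where the toolkit has not been installed.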

81reap commented Feb 23, 2024

Moving this issue to the upstream Nvidia repo: ublue-os/hwe#198

81reap closed this as not planned on Feb 23, 2024