
missing steps in podman instructions when using nvidia GPU #1093

Open
nalabrie opened this issue Sep 2, 2024 · 2 comments
Comments

@nalabrie

nalabrie commented Sep 2, 2024

Hello, I am running Fedora 40 Kinoite (Bazzite) with an NVIDIA GeForce RTX 4070 Ti (proprietary drivers, version 560.35.03). My aim was to install Jellyfin via a podman systemd container file and use GPU acceleration. I followed the instructions here, but it did not work. After hours of debugging, I found a few issues with these instructions:

  1. AddDevice=/dev/dri/:/dev/dri/ is not enough. I also needed to add AddDevice=nvidia.com/gpu=all and have the NVIDIA Container Toolkit installed on my system (podman-specific docs here). I did not need to install it myself; it was already available on my machine, so I cannot assist with proper install instructions. I am unsure whether AddDevice=/dev/dri/:/dev/dri/ is still needed after this; I left it in place to be safe.
  2. The instructions currently say SecurityLabelDisable=true is only needed for versions of container-selinux < 2.226. After making the changes above, it was still not working. I ran rpm -q container-selinux and saw that my version is container-selinux-2.232.1-1.fc40.noarch. I added SecurityLabelDisable=true anyway, since the NVIDIA docs instruct doing so, and now everything is working.
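For anyone hitting point 1: as far as I can tell from the NVIDIA Container Toolkit docs, the AddDevice=nvidia.com/gpu=all name comes from a CDI specification that has to exist on the host. A rough sketch of generating it (assuming nvidia-ctk is installed and using the toolkit's documented default output path; guarded so it is safe to paste on a machine without the toolkit):

```shell
# Sketch of the CDI setup from the NVIDIA Container Toolkit docs (assumes
# nvidia-ctk is installed); guarded so it is safe to run on any machine.
if command -v nvidia-ctk >/dev/null 2>&1; then
  # Generate the CDI spec that AddDevice=nvidia.com/gpu=all refers to
  sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
  # List the device names podman can now reference (e.g. nvidia.com/gpu=all)
  nvidia-ctk cdi list
else
  echo "nvidia-ctk not found: install the NVIDIA Container Toolkit first"
fi
```

On Bazzite this was apparently preinstalled for me, which is why I cannot vouch for the install step itself.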

Here is my full jellyfin.container file:

[Unit]
Description=jellyfin

[Container]
Image=docker.io/jellyfin/jellyfin:latest
AutoUpdate=registry
PublishPort=8096:8096/tcp
UserNS=keep-id
SecurityLabelDisable=true
AddDevice=/dev/dri/:/dev/dri/
AddDevice=nvidia.com/gpu=all
Volume=jellyfin-config:/config:Z
Volume=jellyfin-cache:/cache:Z
Volume=/mnt/elements/media:/media:Z

[Service]
# Inform systemd of additional exit status
SuccessExitStatus=0 143

[Install]
# Start by default on boot
WantedBy=default.target
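For completeness, this is roughly how I understand the file gets picked up by quadlet for rootless podman (a sketch, assuming the file above is saved as jellyfin.container in the current directory; guarded so missing pieces just print a note instead of failing):

```shell
# Sketch: install the jellyfin.container quadlet above for rootless podman.
# Assumes the file is saved as ./jellyfin.container; guarded so the commands
# are safe to paste even where podman/systemd are absent.
unit_dir="${XDG_CONFIG_HOME:-$HOME/.config}/containers/systemd"
mkdir -p "$unit_dir"
if [ -f jellyfin.container ]; then
  cp jellyfin.container "$unit_dir/"
else
  echo "jellyfin.container not found in the current directory"
fi
# Quadlet generates jellyfin.service from the .container file on reload
if command -v systemctl >/dev/null 2>&1; then
  systemctl --user daemon-reload && systemctl --user start jellyfin.service \
    || echo "systemctl --user is unavailable in this environment"
fi
```

Note that quadlet names the generated unit after the file (jellyfin.service here), while the running container itself gets the systemd- prefix (systemd-jellyfin), which is the name used with podman exec below.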

In addition to verifying that GPU acceleration was working by successfully transcoding some media, I also ran podman exec -it systemd-jellyfin bash to get a bash shell inside the container and then ran nvidia-smi. Here is the output:

Mon Sep  2 16:43:10 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 Ti     On  |   00000000:2D:00.0  On |                  N/A |
|  0%   42C    P8             11W /  285W |     649MiB /  12282MiB |     11%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

If the container could not access the GPU, nvidia-smi would return "command not found". I can also see the GPU load increase in a system monitor when playing media that needs to be transcoded.

@nalabrie
Author

nalabrie commented Sep 2, 2024

I'm happy to make a PR, but I wanted to post this as an issue first to see if anyone else can confirm whether these changes are needed, or if I'm missing/misunderstanding anything.

@felix920506
Member

If you are unsure, please ask in our Matrix/Discord channels: https://jellyfin.org/contact
