
Accept other user configuration such as AMD GPU / iGPU / CPU rendering #62

Open
Anghille opened this issue Oct 23, 2024 · 11 comments

@Anghille

I might be off-topic, since this project states it is for NVIDIA and HPC-focused jobs.

I was just wondering whether it is possible (not asking you to do it, just whether it can be done) to modify this image to accept other configurations such as AMD GPUs (server and consumer), or, best of all, to use only CPU encoding/decoding for more portability/compatibility.

I have some use cases running this on Kubernetes clusters with various configurations (home and cloud clusters). I am diving into the image and scripts, trying to understand why NVIDIA is required and how I might be able to change it to something else, or even better, make the image able to run dynamically on AMD, NVIDIA, or just a CPU of any kind.

If you have any advice, that would be awesome 🙏

@ehfd
Member

ehfd commented Oct 24, 2024

The reason only NVIDIA works here is that different GPU vendors need immensely different /etc/X11/xorg.conf configurations.

I imagine something similar to https://github.com/nestriness/nestri/pull/84/files would work.
@DatCaptainHorse, our collaborator, has expanded the X server invocation to AMD and Intel GPUs there, and that script is derived from the one here. He should be able to help you.

Contributions or spinoffs are always welcome, although the code must be clean and conform to the repository style to be merged.
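To illustrate why the xorg.conf is vendor-specific, here is a minimal, hypothetical sketch (the GPU_VENDOR variable and the generated file are illustrative, not this repository's actual entrypoint logic): the Device section has to name a completely different driver per vendor, and the headless workarounds built around it differ too.

```bash
# Hypothetical sketch only; not the actual entrypoint of this image.
# The driver named in the Device section differs per vendor, and so do the
# headless-output workarounds built around it.
case "${GPU_VENDOR:-nvidia}" in
  nvidia)    XORG_DRIVER="nvidia" ;;       # proprietary driver used by this image
  amd|intel) XORG_DRIVER="modesetting" ;;  # generic KMS driver (nestri approach)
  *)         echo "Unsupported GPU_VENDOR: ${GPU_VENDOR}" >&2; exit 1 ;;
esac

cat > /etc/X11/xorg.conf <<EOF
Section "Device"
    Identifier "Device0"
    Driver     "${XORG_DRIVER}"
EndSection
EOF
```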

@ehfd
Member

ehfd commented Oct 24, 2024

For CPUs, as well as Intel and AMD GPUs, docker-nvidia-egl-desktop should currently work, since it uses a virtual X11 server in the first place. But there is a performance overhead that trades speed for compatibility, and the link above implements a real X11 server on AMD and Intel GPUs instead.
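As a rough sketch of that compatibility path (the image reference and port are assumptions; check the docker-nvidia-egl-desktop README for the actual interface), running it on a host with an Intel/AMD GPU would look roughly like:

```bash
# Hypothetical invocation; image name, port, and env vars are assumptions,
# see the docker-nvidia-egl-desktop documentation for the real ones.
docker run -d --name egl-desktop \
  --device /dev/dri \
  -p 8080:8080 \
  ghcr.io/selkies-project/nvidia-egl-desktop:latest
```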

@DatCaptainHorse

As for how I managed to coax X11 into working without a physical output using the modesetting driver on Intel/AMD GPUs: it basically required having a base template xorg.conf and then using xrandr to create the wanted output mode and set it active.

It's definitely not expected behaviour and may be patched out by the X11 devs someday, but outside Wayland it was the only way to get things working 😅

@Anghille I'm currently too busy with Nestri work to contribute big changes here, but if you want to take a look, I can answer any questions to the best of my ability 🙂
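For anyone following along, a minimal sketch of the workaround being described (the output name and the 1080p60 modeline are illustrative; they depend on the GPU and driver):

```bash
# Hypothetical sketch of forcing a mode on a disconnected output with the
# modesetting driver; the output name (HDMI-1) and modeline are illustrative.
xrandr --newmode "1920x1080_60.00" 173.00 1920 2048 2248 2576 1080 1083 1088 1120 -hsync +vsync
xrandr --addmode HDMI-1 "1920x1080_60.00"        # attach the mode to the output
xrandr --output HDMI-1 --mode "1920x1080_60.00"  # set it active despite no display
```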

@Anghille
Author

I will look into that then! If I have any ideas or PRs to contribute, I will gladly do so! Thanks for the answer.

@ehfd
Member

ehfd commented Oct 26, 2024

@Anghille Yeah, thanks. The solution is already there thanks to @DatCaptainHorse; it's just a matter of integrating it seamlessly.
I do not currently have an AMD device, so this needs to be done by somebody who does.

@ehfd
Member

ehfd commented Oct 26, 2024

VirtualGL/virtualgl#229 (comment)
VirtualGL/virtualgl#37

@DatCaptainHorse BTW, does Vulkan work correctly with your approach on Intel or AMD in an unprivileged container? The links above imply that it might be finicky on non-NVIDIA GPUs.

@DatCaptainHorse

@ehfd Yep, Vulkan works with my approach; if DXVK didn't work it would've been less useful 🙂

Just need to pass in the GPU with the docker/podman --device parameter.
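For reference, a minimal example of that (the /dev/dri paths vary per host; check `ls /dev/dri`):

```bash
# Illustrative only; the exact card/render node paths depend on the host.
podman run --device /dev/dri/card0 --device /dev/dri/renderD128 ...
# or pass the whole DRI directory:
docker run --device /dev/dri ...
```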

@Anghille
Author

If anything, can I do a PR with a Kubernetes setup (ConfigMap, Secrets, Deployment, and volumes) for other users' reference in the doc files? Or is it redundant with other repos such as your kubernetes-operator?

@Anghille
Author

Anghille commented Oct 28, 2024

By the way, it might be a nice addition to have an ENV GPU={AMD,NVIDIA}, which would let the user seamlessly integrate the image into a cluster that has both kinds of graphics cards set up (which is my case, for example).

Something that integrates the work from here: https://github.com/nestriness/nestri/pull/84/files, but also keeps the NVIDIA setup. Not sure if this is something you want or if you prefer to keep things separated?
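A hypothetical entrypoint fragment sketching this idea (the GPU variable and the setup functions are made up; none of this exists in the image yet):

```bash
# Hypothetical dispatch for a GPU={AMD,NVIDIA,NONE} env variable;
# the setup_* functions are placeholders, not existing scripts.
case "${GPU:-NVIDIA}" in
  NVIDIA) setup_nvidia_xorg ;;    # current behaviour of this image
  AMD)    setup_amd_xorg ;;       # would reuse the nestri modesetting/xrandr approach
  NONE)   setup_virtual_x11 ;;    # CPU-only software rendering fallback
  *)      echo "Unknown GPU value: ${GPU}" >&2; exit 1 ;;
esac
```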

@DatCaptainHorse

@Anghille my changes there support this kind of ENV variable: GPU_SELECTION=vendor:N, where vendor is the GPU vendor (nvidia, intel, amd, ...) and N is the per-vendor GPU index.

E.g. if you have 2 NVIDIA GPUs, you'd use either GPU_SELECTION=nvidia:0 for the first GPU or GPU_SELECTION=nvidia:1 for the second.

For a mixed scenario (AMD + NVIDIA), you'd choose the AMD one with GPU_SELECTION=amd:0 or the NVIDIA one with GPU_SELECTION=nvidia:0; the N number is per-vendor, starting from 0.

But I don't see why it wouldn't be possible to use the GPU name as the parameter either; the GPU helper script is easy enough to modify.

Installing the script's dependency (lshw), then running source gpu_helpers.sh followed by list_available_gpus in a terminal should output the available GPUs, for example like so:

$ list_available_gpus
Available GPUs:
 [intel:0] "DG2 [Arc A380]" @[pci@0000:05:00.0]

Just look at the script source and you can see how it all works, hack away 🙂
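As a rough illustration of the format only (this is not the actual gpu_helpers.sh code, just a sketch of splitting vendor:N):

```bash
# Sketch only; the real selection logic lives in gpu_helpers.sh.
GPU_SELECTION="${GPU_SELECTION:-nvidia:0}"
GPU_VENDOR="${GPU_SELECTION%%:*}"   # e.g. "amd"
GPU_INDEX="${GPU_SELECTION##*:}"    # e.g. "0"
echo "Selected GPU: vendor=${GPU_VENDOR}, per-vendor index=${GPU_INDEX}"
```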

@ehfd
Member

ehfd commented Oct 28, 2024

@Anghille Check xgl.yml for the Kubernetes deployment. We only ship Deployment configurations because everything else varies wildly between setups.
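For readers who want the general shape, a heavily stripped-down, hypothetical sketch follows (this is not xgl.yml; the image reference and resource key are assumptions, and on AMD nodes the device-plugin resource would be amd.com/gpu rather than nvidia.com/gpu):

```bash
# Hypothetical, minimal Deployment sketch; see xgl.yml in the repo for the
# real manifest. Image name and GPU resource key are assumptions.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xgl
spec:
  replicas: 1
  selector:
    matchLabels: {app: xgl}
  template:
    metadata:
      labels: {app: xgl}
    spec:
      containers:
      - name: xgl
        image: ghcr.io/selkies-project/nvidia-glx-desktop:latest
        resources:
          limits:
            nvidia.com/gpu: 1
EOF
```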
