[RFC] Enable Intel GPU support in torchcodec (pytorch xpu backend device) #559

Open
dvrogozh opened this issue Mar 13, 2025 · 0 comments · May be fixed by #558

dvrogozh commented Mar 13, 2025

This is an RFC issue for the following PR:

The PR enables Intel GPU support in torchcodec by adding ffmpeg-vaapi decoding and connecting it to the PyTorch XPU device backend.

There are a few items I would like to bring up for discussion in this RFC issue.

Selection of the ffmpeg backend. Several are available for Intel GPUs:

  • ffmpeg-vaapi
  • ffmpeg-dx11/dx12
  • ffmpeg-qsv (the libvpl-based one)
  • ffmpeg-vulkan

None of the above media backends stands out as preferable in terms of easier context/memory sharing. Intel media APIs, drivers and libraries do not directly intersect with the Intel compute stack. In particular, Intel compute's Unified Shared Memory pointers are not recognized by media APIs and cannot be accepted directly. This means that memory sharing between media and compute must be done (at the moment) via lower-level APIs such as DMA fds on Linux and NT handles on Windows. This introduces an OS-specific dependency.

I suggest considering ffmpeg-vaapi on Linux and ffmpeg-dx12 on Windows to enable Intel GPUs. These backends are based on Intel media driver APIs. ffmpeg-qsv is based on a higher-level library (libvpl) and does not remove the vaapi/dx dependency, since we would in any case need these APIs to get to the underlying surface memory. I also think that ffmpeg-vaapi is used by AMD GPUs, so adding VAAPI support can help there as well. (See also #444)

ffmpeg-vulkan might be interesting due to its eventual cross-vendor capabilities. However, 1) media support in Vulkan is relatively new and I am not sure that all required features are available (for example, color space conversion to RGBA), and 2) media support in the Vulkan driver for Intel GPUs is a community effort rather than an Intel effort, and its implementation differs from the Intel media driver. Overall, I think ffmpeg-vulkan might be an interesting next step after enabling torchcodec with ffmpeg-vaapi.
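
To make the per-OS proposal above concrete, here is a minimal sketch of how a backend could be picked by platform. The function and mapping are hypothetical and not part of torchcodec or the PR; "vaapi" and "d3d12va" are the ffmpeg hwaccel names I am assuming correspond to ffmpeg-vaapi and ffmpeg-dx12.

```python
import sys

# Hypothetical mapping from a sys.platform prefix to the ffmpeg hwaccel
# proposed in this RFC. Not an existing torchcodec API.
_PROPOSED_HWACCEL = {
    "linux": "vaapi",    # ffmpeg-vaapi on Linux
    "win32": "d3d12va",  # ffmpeg-dx12 on Windows (assumed hwaccel name)
}


def pick_intel_hwaccel(platform: str = sys.platform) -> str:
    """Return the proposed ffmpeg hwaccel name for the given platform string."""
    for prefix, hwaccel in _PROPOSED_HWACCEL.items():
        if platform.startswith(prefix):
            return hwaccel
    raise RuntimeError(f"no Intel GPU decode backend proposed for {platform!r}")
```

An explicit per-OS table like this keeps the OS-specific dependency mentioned above visible in one place, and leaves room for a "vulkan" entry later if that path becomes viable.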

Selection of the color conversion algorithm. Under the current torchcodec architecture, the decoded output (typically NV12) needs to be converted to RGB24. In the current implementation I chose to implement color conversion directly on VAAPI, since that's fairly quick and trivial. I believe that's good enough for a first implementation. I do suggest considering ffmpeg-vaapi for color space conversion going forward, but this will require additional effort on top of the PR I currently provide. A couple of notes on conversion:

  • Current Intel media APIs do not support RGB24 since it is considered a suboptimal format (due to its odd alignment), so they support only RGB32 formats. For that reason I added a step that slices the output RGB32 surface and copies it to the final output surface.
  • Conversion algorithms might give different results. Current tests rely on per-pixel absolute/relative difference. To handle the bigger difference between the converted output and what CUDA produces, I used the PSNR metric through torcheval for checks in tests (changed only for Intel GPU).
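
The two notes above can be illustrated with a small dependency-free sketch. It is not the PR's code: the helper names are hypothetical, the RGB32 surface is assumed to be tightly packed in R, G, B, X byte order (real surfaces may use a different channel order and a row pitch), and PSNR is computed by hand here rather than through torcheval. With a tensor of shape (H, W, 4), the slicing step would simply be a slice of the last dimension.

```python
import math


def rgbx_to_rgb24(rgbx: bytes) -> bytes:
    """Drop the padding byte of each packed 4-byte pixel, keeping R, G, B.

    Assumes R, G, B, X byte order and no row padding (illustrative only).
    """
    if len(rgbx) % 4 != 0:
        raise ValueError("expected 4 bytes per pixel")
    out = bytearray()
    for i in range(0, len(rgbx), 4):
        out += rgbx[i:i + 3]
    return bytes(out)


def psnr(a: bytes, b: bytes, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-size 8-bit buffers."""
    if len(a) != len(b) or not a:
        raise ValueError("buffers must be non-empty and of equal size")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical buffers
    return 10.0 * math.log10(max_value ** 2 / mse)


# Two RGBX pixels; the 4th byte of each pixel is padding.
surface = bytes([10, 20, 30, 255, 40, 50, 60, 255])
frame = rgbx_to_rgb24(surface)

# A reference frame that differs by 1 in two channels, as a different
# conversion algorithm might produce.
reference = bytes([10, 21, 30, 40, 50, 59])
score = psnr(frame, reference)
```

A test built on this idea would compare `score` against a minimum threshold instead of checking every pixel against a fixed tolerance, so small rounding differences between conversion algorithms do not fail the test. The appropriate threshold is a judgment call and is not specified here.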

CC: @scotts, @NicolasHug, @EikanWang
