This is an RFC issue for a PR that enables Intel GPU support in torchcodec by adding ffmpeg-vaapi decoding and connecting it to the PyTorch XPU device backend.
There are a few items I would like to bring up for discussion in this RFC issue.
Selection of ffmpeg backend. There are a few available for Intel GPUs:
ffmpeg-vaapi
ffmpeg-dx11/dx12
ffmpeg-qsv (libvpl-based)
ffmpeg-vulkan
None of the above media backends is preferable in terms of easier context/memory sharing. The Intel media APIs, drivers, and libraries do not directly intersect with the Intel compute stack. In particular, Unified Shared Memory pointers from the Intel compute stack are not recognized by the media APIs and cannot be accepted directly. This means that memory sharing between media and compute must currently be done via lower-level APIs such as DMA-BUF file descriptors on Linux and NT handles on Windows, which introduces an OS-specific dependency.
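To make the Linux path concrete, below is a minimal sketch of exporting a decoded VAAPI surface as a DMA-BUF file descriptor via libva's vaExportSurfaceHandle. The `display`/`surface` handles, the helper name, and the error handling are assumptions for illustration, not code from the PR.

```cpp
#include <va/va.h>
#include <va/va_drmcommon.h>  // VADRMPRIMESurfaceDescriptor

// Sketch: export the DMA-BUF fd backing a decoded VAAPI surface, or -1 on failure.
int exportSurfaceAsDmaBuf(VADisplay display, VASurfaceID surface) {
  VADRMPRIMESurfaceDescriptor desc{};
  VAStatus status = vaExportSurfaceHandle(
      display,
      surface,
      VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2,
      VA_EXPORT_SURFACE_READ_ONLY | VA_EXPORT_SURFACE_COMPOSED_LAYERS,
      &desc);
  if (status != VA_STATUS_SUCCESS || desc.num_objects == 0) {
    return -1;
  }
  // The fd can then be imported on the compute side (e.g. through Level Zero's
  // ZE_EXTERNAL_MEMORY_TYPE_FLAG_DMA_BUF import path) to avoid a copy.
  // Any fds that are not handed off must be closed by the caller.
  return desc.objects[0].fd;
}
```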
I suggest considering ffmpeg-vaapi on Linux and ffmpeg-dx12 on Windows to enable Intel GPUs. These backends are based on the Intel media driver APIs. ffmpeg-qsv is based on a higher-level library (libvpl) and does not let us avoid the VAAPI/DX dependency, since we would still need those APIs to reach the underlying surface memory. VAAPI is also used by AMD GPUs, so adding VAAPI support can help there as well. (See also #444)
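For reference, this is roughly how a decoder gets wired to the ffmpeg-vaapi backend through FFmpeg's hwdevice API; the render node path and the helper name are assumptions for illustration, not the PR's actual code.

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>
}

// Sketch: create a VAAPI device context and attach it to an already-allocated
// AVCodecContext so decoded frames stay in GPU memory (AV_PIX_FMT_VAAPI).
bool attachVaapiDevice(AVCodecContext* codecCtx) {
  AVBufferRef* hwDeviceCtx = nullptr;
  int err = av_hwdevice_ctx_create(
      &hwDeviceCtx,
      AV_HWDEVICE_TYPE_VAAPI,
      "/dev/dri/renderD128",  // assumption: default render node
      nullptr,
      0);
  if (err < 0) {
    return false;
  }
  codecCtx->hw_device_ctx = av_buffer_ref(hwDeviceCtx);
  av_buffer_unref(&hwDeviceCtx);
  // A get_format callback selecting AV_PIX_FMT_VAAPI is also needed on the
  // codec context; it is omitted here for brevity.
  return codecCtx->hw_device_ctx != nullptr;
}
```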
ffmpeg-vulkan might be interesting for its eventual cross-vendor capabilities. However, 1) media support in Vulkan is relatively new and I am not sure that all required features are available (for example, color space conversion to RGBA), and 2) for Intel GPUs, media support in the Vulkan driver is a community effort rather than an Intel effort, with a different implementation than the Intel media driver. Overall, I think ffmpeg-vulkan could be an interesting next step after enabling torchcodec with ffmpeg-vaapi.
Selection of color conversion algorithm. In the current torchcodec architecture, color conversion of the decoded output (typically NV12) to RGB24 is needed. In the current implementation I chose to implement the color conversion directly on VAAPI, since that is fairly quick and trivial, and I believe it is good enough for a first implementation. I do suggest considering ffmpeg-vaapi for color space conversion going forward, but that will require additional effort on top of the PR I currently provide (a hedged sketch of the VAAPI approach is shown after the notes below). A couple of notes on the conversion:
The current Intel media APIs do not support RGB24, since it is considered a suboptimal format (due to its odd alignment); they only support RGB32 formats. For that reason I added slicing of the output RGB32 surface and copying it into the final output surface.
Conversion algorithms can give different results. The current tests rely on per-pixel absolute/relative differences. To handle the larger difference between the converted output and what CUDA produces, I used a PSNR metric through torcheval for the checks in the tests (changed only for Intel GPU).
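To make the "conversion directly on VAAPI" idea above more concrete, here is a hedged sketch of an NV12 to RGB32 conversion through the VAAPI video-processing (VPP) entrypoint. The surface handles, dimensions, and color standards are assumptions for illustration; the PR's actual implementation may differ.

```cpp
#include <va/va.h>
#include <va/va_vpp.h>

// Sketch: convert a decoded NV12 surface into a pre-created RGB32 surface
// using the VAAPI video-processing pipeline.
bool convertNv12ToRgb32(
    VADisplay display,
    VASurfaceID inSurface,   // decoded NV12 surface
    VASurfaceID outSurface,  // pre-created RGB32 (e.g. RGBX) surface
    int width,
    int height) {
  VAConfigID config = VA_INVALID_ID;
  if (vaCreateConfig(
          display, VAProfileNone, VAEntrypointVideoProc, nullptr, 0, &config) !=
      VA_STATUS_SUCCESS) {
    return false;
  }
  VAContextID context = VA_INVALID_ID;
  if (vaCreateContext(
          display, config, width, height, VA_PROGRESSIVE, &outSurface, 1,
          &context) != VA_STATUS_SUCCESS) {
    vaDestroyConfig(display, config);
    return false;
  }

  // One pipeline parameter buffer describing the NV12 -> RGB conversion.
  VAProcPipelineParameterBuffer params{};
  params.surface = inSurface;
  params.surface_color_standard = VAProcColorStandardBT709;  // assumption
  params.output_color_standard = VAProcColorStandardNone;
  VABufferID paramsBuf = VA_INVALID_ID;
  vaCreateBuffer(
      display, context, VAProcPipelineParameterBufferType, sizeof(params), 1,
      &params, &paramsBuf);

  bool ok =
      vaBeginPicture(display, context, outSurface) == VA_STATUS_SUCCESS &&
      vaRenderPicture(display, context, &paramsBuf, 1) == VA_STATUS_SUCCESS &&
      vaEndPicture(display, context) == VA_STATUS_SUCCESS &&
      vaSyncSurface(display, outSurface) == VA_STATUS_SUCCESS;

  vaDestroyBuffer(display, paramsBuf);
  vaDestroyContext(display, context);
  vaDestroyConfig(display, config);
  return ok;
}
```

Since the media APIs only produce RGB32, the padding channel then has to be dropped to get RGB24, for example (libtorch sketch, assuming `rgbx` is an HxWx4 uint8 tensor wrapping the RGB32 output with RGBX channel order):

```cpp
auto rgb24 = rgbx.narrow(/*dim=*/2, /*start=*/0, /*length=*/3).contiguous();
```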
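Regarding the PSNR-based check mentioned above: it boils down to comparing the mean squared error against the peak pixel value, PSNR = 10 * log10(MAX^2 / MSE), so it tolerates small, broadly distributed per-pixel deviations that a strict absolute/relative threshold would flag. The exact data range and threshold used in the tests are not reproduced here.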
CC: @scotts, @NicolasHug, @EikanWang