Hardware accelerated network packet processing? #129

Open
andrewseddon opened this issue Nov 19, 2024 · 4 comments

@andrewseddon

Is there a way to offload network packet processing from the CPU? We have a system with around 20 cameras connected to a server via a 100 Gbit Intel NIC. Packet processing alone takes up about 50% of the CPU. There are many strategies for offloading this network processing, as detailed here. Do Basler and gst-plugin-pylon support any of them?

I assume NVMM alone does not help here, as the CPU still strips the packet headers before the payload is fed into GPU memory.

@andrewseddon changed the title from "Hardware network packet processing?" to "Hardware accelerated network packet processing?" on Nov 19, 2024
@thiesmoeller
Collaborator

Hi @andrewseddon,

For your Intel NIC we have nothing ready yet (at the moment we can accelerate on NVIDIA NICs).
RDMA support will come but is not available now.

The zero-copy strategies listed in your document also have limitations as the camera count gets larger.

To check whether your current setup would be capable of handling full zero-copy, can you show the output of

ethtool -l <interface_name>

@andrewseddon
Author

Channel parameters for enp129s0f0np0:
Pre-set maximums:
RX: 32
TX: 32
Other: 1
Combined: 32
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 32

This is on our Intel 810 NIC, but we are happy to switch to NVIDIA NICs if you have a means to accelerate packet processing on them. Would you recommend something like the ConnectX-6 Dx?
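
As a rough way to read the `ethtool -l` output above, the sketch below parses it and compares the current combined channel count with the camera count. The one-queue-per-camera comparison is an assumption made here for illustration; the thread does not say what the maintainers check for. `parse_ethtool_channels` and `CAMERA_COUNT` are hypothetical names.

```python
def parse_ethtool_channels(text):
    """Parse `ethtool -l` output into {"max": {...}, "current": {...}}."""
    result = {"max": {}, "current": {}}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Pre-set maximums"):
            section = "max"
        elif line.startswith("Current hardware settings"):
            section = "current"
        elif section and ":" in line:
            key, _, value = line.partition(":")
            value = value.strip()
            # Skip non-numeric entries such as "n/a".
            if value.isdigit():
                result[section][key.strip().lower()] = int(value)
    return result


# Output posted in this issue for the Intel NIC.
SAMPLE = """Channel parameters for enp129s0f0np0:
Pre-set maximums:
RX: 32
TX: 32
Other: 1
Combined: 32
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 32
"""

CAMERA_COUNT = 20  # "around 20 cameras" from the issue description

channels = parse_ethtool_channels(SAMPLE)
combined = channels["current"]["combined"]
print(f"combined channels in use: {combined} of {channels['max']['combined']}")
# Assumption: one hardware queue per camera stream is the target.
print("at least one queue per camera:", combined >= CAMERA_COUNT)
```

On a live system the text could come from `subprocess.run(["ethtool", "-l", iface], capture_output=True, text=True).stdout` instead of the pasted sample.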

@thiesmoeller
Collaborator

One easy option to accelerate on an NVIDIA NIC is to use XLIO: https://docs.nvidia.com/networking/display/xliov3402

XLIO is an LD_PRELOAD library that accelerates UDP receive without code changes.
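
A minimal sketch of what the LD_PRELOAD approach could look like when launching a pipeline from Python. It assumes the XLIO library is installed as `libxlio.so` on the loader path and that a bare `pylonsrc ! fakesink` pipeline is enough for a first test; neither detail is confirmed in this thread, and `xlio_env` is a hypothetical helper.

```python
import os
import shlex


def xlio_env(base_env=None, library="libxlio.so"):
    """Return a copy of the environment with `library` prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by spaces or colons.
    env["LD_PRELOAD"] = f"{library} {existing}".strip()
    return env


# Assumed test pipeline; replace with the real one.
command = ["gst-launch-1.0", "pylonsrc", "!", "fakesink"]
env = xlio_env({"PATH": "/usr/bin"})

# Show the equivalent shell invocation without running anything.
print(f"LD_PRELOAD={env['LD_PRELOAD']} {shlex.join(command)}")

# To actually start the pipeline with the preload applied:
#   import subprocess
#   subprocess.run(command, env=xlio_env(), check=True)
```

Because the preload is set only in the child's environment, the rest of the system keeps using the regular kernel socket path.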

For our work on RDMA:
Do you only need the video data on the GPU (NVIDIA GPUDirect),
or do you also want to access the data in CPU memory?

@andrewseddon
Author

Great, I'll give that a go! We could work with either GPU or CPU memory. Right now, we just do H.264 compression and have tested both CPU-based and GPU-based compression; both work great.
