I have a question about using stunner with real-time communications (mediasoup) at scale. Any help answering my concerns here would be greatly appreciated. Must the stunner node receive and then forward traffic to the media server? I believe so. If this is the case, then won't stunner become a bottleneck as the media system scales up? If the media traffic exceeds what a single stunner node can process, wouldn't that be an issue for the media system? To achieve scale, would I need to run multiple instances of stunner, each forwarding traffic to the media servers?
Just as you say, in order to ingest real-time media traffic into the cluster STUNner indeed must "receive and then forward traffic to the media server". STUNner is essentially a semi-standard TURN server disguised as a Kubernetes gateway, and TURN is essentially just a smarter UDP proxy, so the need for packet forwarding here is a given. Furthermore, TURN sessions are pinned to a particular STUNner pod for their entire lifetime; this again is a given in TURN (we have played with the idea of implementing live TURN session migration several times, but no compelling reason to do so has come up so far). It is therefore a valid concern to ask what happens when a single STUNner pod gets overloaded.

Luckily, STUNner supports limitless scale-up and scale-down out of the box, so you are always free to add new STUNner pods (new sessions are shared equally between the old pods and the new one, while old sessions remain pinned to the STUNner pod they already use) and also to remove running STUNner pods (STUNner pods are clever enough to refuse to terminate until all live TURN allocations go away), all without ever blocking new sessions or losing existing ones. Scalability has always been one of the main features of STUNner. In fact, this is the very reason why we dismissed the standard WebRTC Kubernetes deployment models (like the host-networking hack) from the start: they block scaling. Even better, you don't need to handle scaling yourself: just plug STUNner into a Kubernetes Horizontal Pod Autoscaler control loop (see the sketch below) and Kubernetes will keep the number of running STUNner pods just high enough to hold the average CPU utilization in a given range, say, between 30 and 70 percent.

That being said, STUNner may cause considerable CPU load, to the point that, as some of our customers have reported, STUNner's CPU use sometimes outweighs mediasoup's own CPU usage. This is rarely a problem: CPU in the cloud is cheap, your time to maintain something you have to scale up and down manually is not, and most often STUNner makes up for the extra CPU by allowing you to run the minimal number of instances needed to handle the actual load (say, 1-2 pods during night hours, dozens of pods during daytime), and all of this is easy to automate. However, if STUNner's CPU consumption is still a worry for you then we have good news: we are about to release an enterprise STUNner distribution that comes with built-in eBPF acceleration, bringing millions of packets per second of throughput and microsecond-scale latency per CPU core. If interested, write us an email or drop by our Discord, we are always happy to chat.
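To make that last point concrete, here is a minimal HorizontalPodAutoscaler sketch. The Deployment name, namespace, replica bounds and the 50% CPU target are placeholder values: adjust them to the dataplane Deployment that STUNner actually creates for your Gateway, and note that the HPA can only compute utilization if the STUNner pods have CPU resource requests set.

```yaml
# Minimal sketch: autoscale the STUNner dataplane on average CPU utilization.
# Names, namespace and targets below are placeholders, not the defaults of any
# particular STUNner install.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stunner-hpa
  namespace: stunner
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stunner            # placeholder: your STUNner dataplane Deployment
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # keep average CPU use around 50%
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # damp scale-down while sessions drain
```

Because terminating STUNner pods refuse to exit until their live TURN allocations go away, a conservative scale-down window like the one above mostly just reduces churn; existing sessions are not cut either way.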