I have a question about using stunner with real-time communications (mediasoup) at scale. Any help answering my concerns here would be greatly appreciated. Must the stunner node receive and then forward traffic to the media server? I believe so. If this is the case, then won't stunner become a bottleneck as the media system scales up? If the media traffic exceeds what a single stunner node can process, wouldn't that be an issue for the media system? To achieve scale, would I need to run multiple instances of stunner, each forwarding traffic to the media servers?
Just as you say, in order to ingest real-time media traffic into the cluster STUNner indeed must "receive and then forward traffic to the media server". STUNner is essentially a semi-standard TURN server disguised as a Kubernetes gateway, and TURN is essentially just a smarter UDP proxy, so the need for packet forwarding here is a given. Furthermore, TURN sessions are pinned to a particular STUNner pod for their entire lifetime; this again is a given in TURN (we have played with the idea of implementing live TURN session migration several times, but no compelling reason to do so has come up so far). It is therefore a valid concern to ask what happens when a single STUNner pod gets overloaded.

Luckily, STUNner supports limitless scale-up and scale-down out of the box, so you are always free to add new STUNner pods (new sessions are shared equally between the old pods and the new one, while old sessions remain pinned to the STUNner pod they already use) and also to remove running STUNner pods (STUNner pods are clever enough to refuse to terminate until all live TURN allocations go away), all without ever blocking new sessions or losing existing ones. Scalability has always been one of the main features of STUNner. In fact, this is the very reason why we dismissed the standard WebRTC Kubernetes deployment models (like the host-networking hack) from the start: they block scaling. Even better, you don't need to handle scaling yourself: just plug STUNner into a Kubernetes Horizontal Pod Autoscaler control loop (see the sketch below) and Kubernetes will keep the number of running STUNner pods just high enough to hold the average CPU utilization in a given range, say, between 30 and 70 percent.

That being said, STUNner may cause considerable CPU load, to the point that, as some of our customers have reported, STUNner's CPU use sometimes outweighs mediasoup's own CPU usage. This is rarely a problem: CPU in the cloud is cheap, your time to maintain something you have to scale up and down manually is not, and most often STUNner makes up for the extra CPU by allowing you to run the minimal number of instances needed to handle the actual load (say, 1-2 pods during night hours, dozens of pods during daytime), and all of this is easy to automate. However, if STUNner's CPU consumption is still a worry for you then we have good news: we are about to release an enterprise STUNner distribution that comes with built-in eBPF acceleration, bringing millions of packets per second of throughput and microsecond-scale latency per CPU core. If interested, write us an email or drop by our Discord, we are always happy to chat.
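To make that last point concrete, here is a minimal HorizontalPodAutoscaler sketch. The Deployment name, namespace, replica bounds and the 50% CPU target are placeholder values: adjust them to the dataplane Deployment that STUNner actually creates for your Gateway, and note that the HPA can only compute utilization if the STUNner pods have CPU resource requests set.

```yaml
# Minimal sketch: autoscale the STUNner dataplane on average CPU utilization.
# Names, namespace and targets below are placeholders, not the defaults of any
# particular STUNner install.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stunner-hpa
  namespace: stunner
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stunner            # placeholder: your STUNner dataplane Deployment
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # keep average CPU use around 50%
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # damp scale-down while sessions drain
```

Because terminating STUNner pods refuse to exit until their live TURN allocations go away, a conservative scale-down window like the one above mostly just reduces churn; existing sessions are not cut either way.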