WALDEMAR KOZACZUK edited this page Aug 1, 2022 · 26 revisions

The OSv networking stack originates from FreeBSD circa 2013, but it has since been heavily modified to implement Van Jacobson's "network channels" design, which reduces the number of locks and lock operations. For more theory and high-level design details, please read the "Network Channels" chapter of the OSv paper.

This wiki page focuses instead on the code and on where these design ideas are implemented. It still covers only the tip of the iceberg: the code of the networking stack, located mostly under the bsd/ subtree.

Slow Path vs Fast Path

The code contains many references to both the "fast path" and the "slow path". To understand both, a good starting point is this code in the virtio-net driver (the vmxnet3 driver has similar code):

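void net::receiver()
{
...
    // Fast path: try to hand the mbuf to the matching net channel, if any.
    bool fast_path = _ifn->if_classifier.post_packet(m_head);
    if (!fast_path) {
        // Slow path: fall back to the regular FreeBSD input routine.
        (*_ifn->if_input)(_ifn, m_head);
    }
...
}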

In essence, this code is called to process incoming data (RX) from the network card, and it tries to "push" the resulting mbuf onto a net channel (the fast path). If that fails, it falls back to if_input, the traditional FreeBSD way of handling input.

The if_classifier, a member of the struct ifnet that describes a network interface and is defined in if_var.h, is an instance of the class classifier. The method post_packet() used in the code above is part of the 'producer' interface. Its role is to classify the mbuf in question, that is, to determine whether it has a corresponding net channel, and if so to push the mbuf onto that net channel and wake the channel's consumers. So the network card driver, virtio-net in this example, is a "producer" in the context of the net channel, and threads blocked in send, recv, or poll are "consumers". Each net channel instance corresponds to a single TCP connection.
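The producer/consumer relationship can be illustrated with a small standalone sketch. It assumes, purely for illustration, that a channel behaves like a bounded single-producer/single-consumer queue; ToyRing and ToyPacket are hypothetical names and not OSv types, and waking blocked consumers is left out.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for an mbuf: only the payload length is kept.
struct ToyPacket {
    int length;
};

// Toy bounded single-producer/single-consumer ring (an assumption made for
// illustration; this is not the OSv net_channel class).
template <std::size_t N>
class ToyRing {
public:
    // Producer side (think: the driver's RX thread).
    // Returns false when the ring is full, so the caller can fall back.
    bool push(const ToyPacket& p)
    {
        std::size_t head = _head.load(std::memory_order_relaxed);
        std::size_t tail = _tail.load(std::memory_order_acquire);
        if (head - tail == N) {
            return false;
        }
        _slots[head % N] = p;
        // Publish the slot to the consumer.
        _head.store(head + 1, std::memory_order_release);
        return true;
    }

    // Consumer side (think: a thread that was blocked in recv).
    // Returns false when there is nothing to read.
    bool pop(ToyPacket& out)
    {
        std::size_t tail = _tail.load(std::memory_order_relaxed);
        std::size_t head = _head.load(std::memory_order_acquire);
        if (tail == head) {
            return false;
        }
        out = _slots[tail % N];
        // Hand the slot back to the producer.
        _tail.store(tail + 1, std::memory_order_release);
        return true;
    }

private:
    std::array<ToyPacket, N> _slots{};
    std::atomic<std::size_t> _head{0};  // written only by the producer
    std::atomic<std::size_t> _tail{0};  // written only by the consumer
};
```

In this sketch a failed push() plays the same role as post_packet() returning false: the producer learns immediately that the packet was not taken and can hand it to another path.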

Here is an example of a "successful" fast path traversal:

0xffff8000015ff040 virtio-net-rx    0        21.143180806 net_packet_in        b'IP truncated-ip - 14 bytes missing! 192.168.122.1.36394 > 192.168.122.15.8000: Flags [P.], seq 2688002:2688090, ack 2893961834, win 65535, length 88'
  log_packet_in(mbuf*, int) core/net_trace.cc:143
  classifier::post_packet(mbuf*) core/net_channel.cc:133
  virtio::net::receiver() drivers/virtio-net.cc:542
  std::_Function_handler<void (), virtio::net::net(virtio::virtio_device&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) drivers/virtio-net.cc:243
  __invoke_impl<void, virtio::net::net(virtio::virtio_device&)::<lambda()>&> /usr/include/c++/11/bits/invoke.h:61
  __invoke_r<void, virtio::net::net(virtio::virtio_device&)::<lambda()>&> /usr/include/c++/11/bits/invoke.h:154
  _M_invoke /usr/include/c++/11/bits/std_function.h:290
  sched::thread::main() core/sched.cc:1267
  thread_main_c arch/x64/arch-switch.hh:325
  thread_main arch/x64/entry.S:116

Now, how exactly does post_packet() "classify" the packet? Under the hood, it calls the method classify_ipv4_tcp(), which first verifies that the packet belongs in the "fast path" category, meaning more or less:

  • is it an IP packet?
  • does it carry a TCP payload?
  • is the TCP segment free of the TH_SYN, TH_FIN, and TH_RST flags, meaning the underlying TCP connection is in the right state?

The last condition effectively means that only sockets in the ESTABLISHED, CLOSE_WAIT, FIN_WAIT_2, or TIME_WAIT state "participate" in the fast path traversal. In other words, the fast path only plays a role once a TCP connection is established, and the slow path handles the establishment and tear-down of a TCP connection.
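The three checks listed above can be sketched as a standalone function over a raw Ethernet frame. This is a simplified illustration, not the body of classify_ipv4_tcp(): the toy namespace, the function names, and the make_frame() helper are hypothetical, the header offsets and flag values are the standard Ethernet/IPv4/TCP ones, and the real code may perform additional checks.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace toy {

constexpr std::size_t ether_hdr_len = 14;
constexpr std::size_t min_ip_hdr_len = 20;
constexpr std::size_t min_tcp_hdr_len = 20;
constexpr std::uint16_t ethertype_ipv4 = 0x0800;
constexpr std::uint8_t ipproto_tcp = 6;
constexpr std::uint8_t th_fin = 0x01;
constexpr std::uint8_t th_syn = 0x02;
constexpr std::uint8_t th_rst = 0x04;

// Returns true when the frame is IPv4, carries TCP, and has none of the
// SYN, FIN, or RST flags set.
inline bool is_fast_path_candidate(const std::uint8_t* frame, std::size_t len)
{
    if (len < ether_hdr_len + min_ip_hdr_len) {
        return false;
    }
    // 1. Is it an IP packet? (EtherType is big-endian at offset 12.)
    auto ether_type = static_cast<std::uint16_t>((frame[12] << 8) | frame[13]);
    if (ether_type != ethertype_ipv4) {
        return false;
    }
    const std::uint8_t* ip = frame + ether_hdr_len;
    std::size_t ip_len = static_cast<std::size_t>(ip[0] & 0x0f) * 4;
    if (ip_len < min_ip_hdr_len) {
        return false;
    }
    // 2. Does it carry TCP? (Protocol field is at offset 9 of the IP header.)
    if (ip[9] != ipproto_tcp) {
        return false;
    }
    if (len < ether_hdr_len + ip_len + min_tcp_hdr_len) {
        return false;
    }
    // 3. No SYN, FIN, or RST? (Flags are at offset 13 of the TCP header.)
    const std::uint8_t* tcp = ip + ip_len;
    return (tcp[13] & (th_syn | th_fin | th_rst)) == 0;
}

// Hypothetical helper for experiments: builds a zero-filled minimal frame
// with the given EtherType, IP protocol, and TCP flags.
inline std::vector<std::uint8_t> make_frame(std::uint16_t ether_type,
                                            std::uint8_t ip_proto,
                                            std::uint8_t tcp_flags)
{
    std::vector<std::uint8_t> f(ether_hdr_len + min_ip_hdr_len + min_tcp_hdr_len, 0);
    f[12] = static_cast<std::uint8_t>(ether_type >> 8);
    f[13] = static_cast<std::uint8_t>(ether_type & 0xff);
    f[ether_hdr_len] = 0x45;                 // IPv4, header length of 5 words
    f[ether_hdr_len + 9] = ip_proto;
    f[ether_hdr_len + min_ip_hdr_len + 12] = 0x50;       // TCP data offset of 5 words
    f[ether_hdr_len + min_ip_hdr_len + 13] = tcp_flags;
    return f;
}

} // namespace toy
```

With this sketch, a data segment such as the [P.] packet in the trace above (PSH and ACK set, flags byte 0x18) qualifies, while a SYN segment does not.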

post_packet() pushes an mbuf onto a net channel only if one exists. But when does a net channel get created? A net channel is constructed by tcp_setup_net_channel() and destroyed by tcp_teardown_net_channel() or tcp_free_net_channel(). tcp_setup_net_channel() is called from two places in tcp_do_segment() when a TCP connection becomes established. tcp_teardown_net_channel() is called by tcp_do_segment() when a socket in the ESTABLISHED state transitions to CLOSE_WAIT, and by tcp_usr_close() and tcp_usrclosed() when an established socket is closed. tcp_free_net_channel(), on the other hand, is called by tcp_discardcb() when the closing of a TCP connection begins in the other TCP state machine cases.

tcp_setup_net_channel() is key because it binds the "consumers" of a net channel by calling add_poller() and add_epoll(). It also registers the new net channel in the RCU hashtable kept as part of the classifier.
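The registration and lookup described above can be sketched as a table keyed by connection. This is a rough model only: it assumes the key is the usual TCP 4-tuple, uses std::map in place of the RCU hashtable, and counts wake-ups instead of waking real threads. FlowKey, ToyFlowChannel, ToyClassifier, and receive() are hypothetical names, not OSv code.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>
#include <utility>
#include <vector>

// Assumed lookup key: the TCP 4-tuple identifying one connection.
struct FlowKey {
    std::uint32_t src_addr;
    std::uint32_t dst_addr;
    std::uint16_t src_port;
    std::uint16_t dst_port;

    bool operator<(const FlowKey& o) const
    {
        return std::tie(src_addr, dst_addr, src_port, dst_port)
             < std::tie(o.src_addr, o.dst_addr, o.src_port, o.dst_port);
    }
};

// Toy channel: records queued packet lengths and how often it was "woken".
struct ToyFlowChannel {
    std::vector<int> queued;
    int wakeups = 0;
};

class ToyClassifier {
public:
    // Analogous to registering a channel when a connection is established.
    void add(const FlowKey& key, std::shared_ptr<ToyFlowChannel> ch)
    {
        _channels[key] = std::move(ch);
    }

    // Analogous to removing the channel when the connection is torn down.
    void remove(const FlowKey& key)
    {
        _channels.erase(key);
    }

    // Returns true only if a channel exists for this flow (fast path taken).
    bool post_packet(const FlowKey& key, int packet_len)
    {
        auto it = _channels.find(key);
        if (it == _channels.end()) {
            return false;
        }
        it->second->queued.push_back(packet_len);
        ++it->second->wakeups;
        return true;
    }

private:
    std::map<FlowKey, std::shared_ptr<ToyFlowChannel>> _channels;
};

enum class Path { fast, slow };

// Mirrors the shape of the driver code: try the channel, else fall back.
inline Path receive(ToyClassifier& classifier, const FlowKey& key, int packet_len)
{
    return classifier.post_packet(key, packet_len) ? Path::fast : Path::slow;
}
```

In this model a packet for a connection that has no registered channel, either because it is not yet established or because it has already been torn down, always ends up on the slow path.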

Coming back to the original code: if the fast path fails, that is, post_packet() returns false, the if_input function (the slow path) is called.

To conclude, the fast path is fast because it pushes the packet directly onto a net channel rather than traversing the traditional stack call paths, which involve many locks (the slow path).

Top-down Direction

Bottom-up Direction
