bgpd: add input per-peer priority pkt queue #16517
Conversation
On a heavily loaded system with many connections and many BGP updates received, the system has trouble maintaining BGP connectivity with peers. For instance, a wireshark capture shows a BGP OPEN packet being received but only processed by the device many seconds later. By then, the remote peer had already torn down the connection. The proposed fix creates a second stream of buffers reserved for priority BGP packets: BGP OPEN, BGP NOTIFICATION, BGP KEEPALIVE. When the bgp_packet_receive() job runs, this priority fifo is dequeued before the standard fifo queue is consumed. Signed-off-by: Philippe Guibert <[email protected]>
```
@@ -200,8 +203,18 @@ static int read_ibuf_work(struct peer_connection *connection)
	stream_set_endp(pkt, pktsize);

	frrtrace(2, frr_bgp, packet_read, connection->peer, pkt);

	/* get the type of message */
	stream_forward_getp(pkt, BGP_MARKER_SIZE + 2);
```
Please write some code that ensures that the pktsize is sufficient to actually read this data.
I have reservations about this: if a system is so loaded that it is running seconds behind, as in the description, should it be prioritizing more connections and more work? What good would come from that?
I'm pretty OK with making such a change, although I really don't understand how what @pguibert6WIND has done here will actually solve the problem being stated. I do not see any code where we prioritize the special packet types so that they are handled immediately -vs- handling other peers already knee deep in reading in full tables. Can I be pointed at this decision? Additionally, if a peer is already up, in bgp_io.c we reset the keepalive timer for any packet we receive, so I think the prioritizing of keepalive packets makes no sense there. All peers eventually end up adding a connect->t_process_packet event, which is per peer and is why I don't think this will work.
I have a question about when Graceful Restart is used. End-of-RIB is sent as BGP_MSG_UPDATE; does that mean we do not prioritize those messages?
Typo: the main issue is related to BGP OPEN messages that are processed too late.
The scenario is an RR with 998 listening peers and 2 speaking peers, advertising 4M L3VPN prefixes.
You don't explain how this helps, given the way the system works. If I have 998 peers, that means I possibly have 998 events ahead of the new peer's event in the queue. Each of those events can already process a whole bunch of packets. Those events are the ones not allowing us to get to the problem you are describing. Remember, we have an event per peer to handle its i/o; adding a priority queue for the peer doesn't help when I have to process the events from the other peers first (as their input queues could be filled). To solve the problem, from my perspective, you would need all priority events handled before any normal packet processing could occur. This implies some sort of equivalent to zebra's MetaQ handling. Finally, your answer about keepalives makes no sense to me as well, as FRR currently counts any packet received as triggering the keepalive already. Since we made that change I have not seen keepalive problems at all. Is this code you have had internally for a long time and are now just forward porting?
This code is a trial done a few weeks ago. You are totally right about the per-peer queue case; thanks for pointing that out.
About the BGP keepalive, I think this may still happen.
Moving it to draft to decide the future of this commit.
As far as I understand the code, this change will not do anything to make the situation better. I am just adding a Request Changes so that nothing gets added until we resolve the actual problem we are trying to solve.
I agree with @mjstapp here. I feel a better approach would be to prioritize the keepalive messages instead, so that existing sessions do not flap due to the high volume of updates.