-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[inetstack] TCP ctrlblk checks SeqNumber of out-of-order FIN #924
base: dev
Are you sure you want to change the base?
Conversation
if fin == recv_next { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an example that show cases this error? self.out_of_order_fin.get()
is indeed supposed to match recv_next
at this point. That is why we have the debug_assert!()
there. With your change we are now failing silently and returning something that is not expected from this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I faced this issue while using our own codebase based on an old version of Demikernel. In an experiment using an open-loop TCP load generator, the behavior of the error case I observed was as follows:
- Some TCP packets are dropped in the middle
- While those dropped packets are recovered, subsequent packets are stored in the out-of-order buffer
- Once the out-of-order buffer becomes full (size limit 16 in Demikernel), more packets are dropped
- Eventually, the client reaches the last packet and FIN, which is stored at the end of the out-of-order buffer
- Thus, the out-of-order buffer has some packets and FIN, and there are some dropped packets in between those out-of-order packets and FIN.
- Without the above condition, the server processes the FIN message while it has not received all prior messages yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, but then it looks like the actual bug is in the way that we queue out of order packets, right? If we check in this, I would suggest at least to add a warning statement there. What are your thoughts @iyzhang .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably queue them in the same order that they arrive and it sounds like we might have some dropped packets that arrive after the fin and we should process them first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is actually correct, the debug_assert shouldn't be there. This code predates me, but I didn't recognize the bug when I wrote the comments on line 1120 and 1121, which should now also be removed.
Basically, the bug here is an assumption that if a packet arrives that fills a hole in the sequence space (due to a previously dropped packet) and thus we can now take some subsequent data off of the out-of-order queue and put it on the receive queue, that that data is all the data on the out-of-order queue (and thus we should now process the FIN). But if there were multiple holes in the sequence space (due to multiple drops of non-contiguous packets), this assumption would be false. We could "added_out_of_order" without adding all out of order, and thus shouldn't receive the FIN yet.
This bug could be fixed in other ways, however, that might both be clearer and save a couple of "if" statements for better perf. If "added_out_of_order" was renamed "added_all_out_of_order", and a "added_all_out_of_order = false" was added to the else clause between lines 1099 and 1100, then the original code would go back to being correct, and we could keep the debug_assert to check correctness in debug builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @BrianZill, I think the way of using "added_all_out_of_order" doesn't work for this case.
Let's consider a TCP stream with sequence numbers 1, 2, 3, 4, 5 (where 5 is the FIN).
If the server receives them in out-of-order as 1 3 5 2 4, the behavior will be
- 1 is received
- 3 is stored into the out-of-order queue
- 5 is stored as out-or-order-fin
- 2 is received, and then 3 is taken and received from out-of-order queue (out-of-order queue becomes empty and "added_all_out_of_order" is true at this point)
- 5 (FIN) is processed before receiving 4
so, "added_all_out_of_order" doesn't actually work as we intend here.
What do you think?
if fin == recv_next { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably queue them in the same order that they arrive and it sounds like we might have some dropped packets that arrive after the fin and we should process them first.
This PR fixes a bug in TCP control block that does not properly check the sequence number before processing out-of-order FIN.