-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lib: Correctly handle ppoll pfds.events == 0 #17025
Conversation
lib/event.c
Outdated
@@ -1690,6 +1690,35 @@ static void thread_process_io(struct event_loop *m, unsigned int num) | |||
for (i = 0; i < last_read && ready < num; ++i) | |||
thread_process_io_inner_loop(m, num, pfds, &i, &ready); | |||
|
|||
/* | |||
* At this point any fd's that were processed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused about the issue: if a process closes an fd without cancelling the read or write task that uses the fd, that sounds like an application issue that should be fixed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is cancelling the read/writes! It's a different problem. Here's the normal loop of events for read or writes:
program registers with the event system a read/write event. ppoll fds array is modified appropriately
ppoll is run eventually. Read or write event happens. lib/event.c clears the read/write event leaving a .event with 0 and calls the appropriate handler.
Handler wakes up and decides that no more reads or writes should be put on the ppoll fds and does not reregister itself. .
ppoll is entered again and the socket is closed and ppoll sends us a POLLERR which means that something bad happened with the fd, but since nothing got reregistered it puts out the warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that the array of fds - handler.pfds - isn't removing an fd after each event on the fd. doesn't it look like the fd is left in place, with zero poll bits set? what's clearing the fd from the pfds array, other than the event cancel apis - is it the POLLNVAL path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is exactly what it is doing.
In any event I thought of a different way to do this that removes a expensive walk. testing it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand the desire to keep the array stable - in most patterns, an application is going to use a socket ... more than once. but I wonder if it's the copy of the array that's maybe the problem. it's just a brute-force copy right now, but if it were a little smarter, and avoided copying the fds with poll bits of zero, that'd be a sort of compromise? there's code that assumes the 'copy' is an exact copy, so that'd have to be fixed, and maybe that'd be costly.
89f974a
to
b05fa44
Compare
b05fa44
to
4f735b4
Compare
lib/event.c
Outdated
* into that spot, so note it | ||
*/ | ||
if (m->handler.pfds[i].fd == -1 && earlierqueuepos == m->handler.pfdcount) | ||
earlierqueuepos = i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just check and set queuepos
here and remove earlierqueuepos
and queuepos_found
… since the loop is gonna continue to run, and if we find the fd we overwrite queuepos
anyway and break
after…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
The frrevent system is spitting out this message in bgpd: 20:40:15 mem1-roc-f2-b1-r5-t2-d4 bgpd[13166]: [XETTR-D5MR0][EC 100663316] Attempting to process an I/O event but for fd: 214(8) no thread to handle this! This is because as each io event is processed, it is possible that a .events is set to 0. This can leave a situation where we ask ppoll to handle anything that happens on a fd with a .events of 0, in this situation ppoll can return POLLERR, which indicates that something bad has happened on the fd. Let's set the ppoll fds.fd value to -1 when there are no more events to be processed. ppoll specifically calls out that it will just skip this particular one. Signed-off-by: Donald Sharp <[email protected]>
4f735b4
to
d11ad98
Compare
The frrevent system is spitting out this message in bgpd: 20:40:15 mem1-roc-f2-b1-r5-t2-d4 bgpd[13166]: [XETTR-D5MR0][EC 100663316] Attempting to process an I/O event but for fd: 214(8) no thread to handle this!
This is because as each io event is processed, it is possible that a .events is set to 0. This can leave a situation where we ask ppoll to handle anything that happens on a fd with a .events of 0, in this situation ppoll can return POLLERR, which indicates that something bad has happened on the fd.
Let's cleanup the pfds after io events are handled that can be cleaned up.