XmlRpcServer::countFreeFDs() takes too long inside docker container #1927
Comments
@trainman419 FYI since you added this logic in #1243. |
An example from a bare ubuntu 18.04 container:
|
|
As a workaround, you could run […]. Most of the time in this function is probably spent in the […]. There may also be a faster Linux system call to query the number of used or free file descriptors for a process; I did a quick search and didn't find anything, but this is the kind of function that is probably out there somewhere. |
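One Linux-specific option, not proposed in this thread, is to count the entries in `/proc/self/fd`, which lists the descriptors the process currently has open. The sketch below is only an illustration under that assumption; `countOpenFds` is a hypothetical name, and the approach needs a mounted `/proc`.

```cpp
#include <dirent.h>

#include <cstdlib>

// Hypothetical helper: number of descriptors currently open in this process,
// or -1 if /proc/self/fd cannot be read (for example on non-Linux systems).
int countOpenFds() {
  DIR* dir = opendir("/proc/self/fd");
  if (dir == nullptr) return -1;

  // opendir() itself holds a descriptor; exclude it from the count.
  const int dir_fd = dirfd(dir);
  int count = 0;
  while (const dirent* entry = readdir(dir)) {
    if (entry->d_name[0] == '.') continue;               // skip "." and ".."
    if (std::atoi(entry->d_name) == dir_fd) continue;    // skip our own stream
    ++count;
  }
  closedir(dir);
  return count;
}
```

The cost here scales with the number of open descriptors rather than with the descriptor limit, which is the property that matters when the limit is very large.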
A general question, is there really a reason for this whole change? Was someone actually running out of file descriptors (and they couldn't increase it via ulimit)? |
I think querying file descriptors in smaller batches and terminating early might be a proper fix? |
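To make the batching idea concrete, here is a minimal sketch; it is not the code from either candidate PR. It assumes unused descriptors are detected by calling `poll()` and checking for `POLLNVAL`. The helper name `hasFreeFds`, the `needed` threshold, and the default batch size of 1024 are all hypothetical.

```cpp
#include <poll.h>
#include <sys/resource.h>

#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical helper: returns true as soon as `needed` unused descriptors
// have been seen, scanning the descriptor range in batches of `batch_size`.
bool hasFreeFds(int needed, int batch_size = 1024) {
  assert(needed >= 0 && batch_size > 0);
  if (needed == 0) return true;

  struct rlimit limit;
  if (getrlimit(RLIMIT_NOFILE, &limit) != 0) return false;
  // Descriptors are ints, so cap the scan range (this also covers RLIM_INFINITY).
  const long long max_fd =
      static_cast<long long>(std::min<rlim_t>(limit.rlim_cur, INT_MAX));

  std::vector<pollfd> fds(batch_size);
  int free_count = 0;
  for (long long start = 0; start < max_fd; start += batch_size) {
    const long long n = std::min<long long>(batch_size, max_fd - start);
    for (long long i = 0; i < n; ++i) {
      fds[i].fd = static_cast<int>(start + i);
      fds[i].events = 0;   // not waiting for any event
      fds[i].revents = 0;
    }
    // Timeout 0: return immediately; POLLNVAL marks descriptors that are not open.
    if (poll(fds.data(), static_cast<nfds_t>(n), 0) < 0) return false;
    for (long long i = 0; i < n; ++i) {
      if ((fds[i].revents & POLLNVAL) && ++free_count >= needed) return true;
    }
  }
  return false;  // scanned the whole range without finding enough
}
```

With a limit of 1048576 and only a handful of descriptors in use, a call such as `hasFreeFds(8)` would normally return after the first batch instead of polling the whole range.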
I think this should be handled by the system (it is a system responsibility), not by userland. With accept(), the server can detect that it has hit the fd limit by checking errno for EMFILE, and then report an error telling the user that they could increase the number of fds. |
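A rough sketch of that approach, assuming a plain POSIX listening socket; `acceptOrReport` is a hypothetical wrapper and not part of xmlrpcpp. It also checks `ENFILE` (the system-wide limit), which the comment above does not mention.

```cpp
#include <sys/socket.h>

#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical wrapper: accepts one connection and, if the descriptor limit
// has been reached, prints a hint instead of failing silently.
// Returns the new descriptor, or -1 with errno set as accept() left it.
int acceptOrReport(int listen_fd) {
  int fd;
  do {
    fd = accept(listen_fd, nullptr, nullptr);
  } while (fd < 0 && errno == EINTR);  // retry if interrupted by a signal

  if (fd < 0 && (errno == EMFILE || errno == ENFILE)) {
    const int saved_errno = errno;  // fprintf may overwrite errno
    std::fprintf(stderr,
                 "accept() failed: %s; consider raising the file descriptor "
                 "limit (for example with ulimit -n)\n",
                 std::strerror(saved_errno));
    errno = saved_errno;
  }
  return fd;
}
```

Note that when accept() fails this way the pending connection stays in the queue, so the caller still has to back off or stop polling the listening socket to avoid spinning.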
sounds reasonable to me |
Would it be possible to merge some kind of option to turn the check off until this gets fixed overall? |
Yes; there were bugs in XmlRpc where it would leak file descriptors and then would go into an infinite loop when it ran out of file descriptors; see https://answers.ros.org/question/250393/rosout-leaks-file-descriptors/ and #914 and the seven PRs with tests and bug fixes listed on #1214 . |
Why was this codebase using that library then? (Was the upstream XmlRpc library/whatever fixed?) |
This is a really, really insensitive question to ask. ROS is over 10 years old, and these bugs were not known when the original authors of ROS chose to use XmlRpc. If you took a few minutes to review the PRs and look at the upstream version of XmlRpc, you'd see that it appears abandoned and has not gotten any updates in many years. ROS has chosen to continue doing maintenance fixes on XmlRpc instead of trying to find a complete replacement of it. I'm here as a volunteer and because I feel some responsibility to the community. I'm happy to help with solutions, but if you're going to focus on placing blame then you're on your own to fix this. |
Nope, not focusing blame, only asking questions that I'd like to understand out of curiosity. |
@wenbin1989 @trainman419 @harlowja is anybody willing to contribute the code for this, or has someone already started? |
@wenbin1989 @trainman419 @harlowja Candidate#1: #1928 Candidate#2: #1929 I'd like to go with Candidate#2, since we do not want to create unexpected problems with this minor fix. @wenbin1989 |
@fujitatomoya will do, thanks! |
@fujitatomoya commented in the PR, thanks |
Sweet, will take a look soon. |
Hi there,
We recently upgraded from ROS Kinetic to ROS Melodic and found that node connections take too long to establish, sometimes over 2 seconds.
Our application is very time-sensitive, and this affects it a lot.
After some debugging, I found that the issue is caused by the countFreeFDs() function in xmlrpcpp. Its comments say:
But inside a Docker container, the file descriptor limit is usually 1048576:
which takes a very long time to query.
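For anyone who wants to check the limit their own process sees, here is a minimal sketch using `getrlimit()`; the function name `softFdLimit` is hypothetical.

```cpp
#include <sys/resource.h>

// Hypothetical helper: stores the soft limit on open file descriptors in
// `out` and returns true, or returns false if getrlimit() fails.
// `out` may be RLIM_INFINITY when no limit is configured.
bool softFdLimit(rlim_t& out) {
  struct rlimit limit;
  if (getrlimit(RLIMIT_NOFILE, &limit) != 0) return false;
  out = limit.rlim_cur;
  return true;
}
```

Any check that walks every possible descriptor up to this value does work proportional to the limit, not to the number of descriptors actually open.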
The Docker version we are using is
Any idea how to fix this? Thanks