roscore in 1.13.6 is broken on macos: #1357
Comments
For the record, the immediate reason for the exception is that in
for some reason |
But even with update of
|
According to the docs @trainman419 @NikolausDemmel Can either of you create a pull request and check that it fixes the problem? |
I'm not sure what the appropriate action is in that case? Sticking with the default of 1024? It would be great to get @trainman419's take on this. However, even when reverting #1243, just like when leaving I just checked overlaying ros_comm 1.13.5 (all packages), and with that roscore still works fine. (For it to compile, you need to cherry-pick #1239 and comment out the xmlrpcpp tests as suggested in mikepurvis/ros-install-osx#110 (comment)) |
I'll take a look and try to update tests for the RLIM_INFINITY case. |
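To make the RLIM_INFINITY case concrete, here is a minimal sketch of the fallback idea raised above: if the soft limit on open files is reported as unlimited, fall back to a fixed default. It is written in Python using the standard `resource` module purely for illustration; it is not the xmlrpcpp code. The helper name `effective_fd_limit` is hypothetical, and the default of 1024 is only the value floated in this thread.

```python
import resource

# Fallback value floated in this thread; an assumption, not a confirmed fix.
DEFAULT_FD_LIMIT = 1024


def effective_fd_limit(soft_limit: int, default: int = DEFAULT_FD_LIMIT) -> int:
    """Return a usable file-descriptor count for sizing poll/select structures.

    Hypothetical helper: if the soft limit is reported as "unlimited"
    (RLIM_INFINITY), or shows up as a negative number after a signed
    conversion, use the default instead.
    """
    if soft_limit == resource.RLIM_INFINITY or soft_limit < 0:
        return default
    return soft_limit


# Query the limits of the current process and apply the fallback.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "-> effective limit:", effective_fd_limit(soft))
```

The explicit negative check is there because the numeric value of RLIM_INFINITY differs between platforms, so comparing against a hard-coded number would not be portable.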
It's a bit of a hack, but lunar-devel...trainman419:lunar-devel will probably fix this. I don't have an OSX machine handy and the existing tests don't mock out |
If you haven't already, please also try running the xmlrpcpp tests on OSX. I suspect the existing tests will catch this and may also show that my fix works. |
Thanks for the quick response. I will test and report back. |
@trainman419, unfortunately, I cannot run unit tests. When I try to compile them I get this:
See also: mikepurvis/ros-install-osx#110 (comment) Are the unit tests using google-mock? That is currently not available on osx, see mikepurvis/ros-install-osx#110 (comment) |
@trainman419, the patch works (with a small fix for signedness; turns out With this some very basic local testing confirms that roscore works. I still get these errors when starting roscore:
It might have been there already earlier. Sometimes I get all three errors, sometimes only 2 or 1, sometimes none. Not sure which process those messages come from. |
I'm seeing the same "Failed to contact master" errors as @NikolausDemmel. I guess related to that is the problem that most of the time my callback functions for subscribed topics are not called. I need to restart the CPP node multiple times to be lucky that a simple subscriber program actually receives messages. Even though Since I have a MacBook I could and would be willing to help debug this issue, though I would need some guidance on where to start looking. |
@NikolausDemmel the error messages you're seeing are almost certainly coming from the I suspect there's some startup timing difference between Linux and OSX, and on OSX either the roscore node starts up more quickly, or the master starts up more slowly, resulting in a missed registration between the node and the master. I suspect this is a result of some of the recent changes to error handling and recovery in xmlrpcpp, but it's hard to confirm because I have not had time to reproduce this locally. |
@trainman419, what would be a good way to help narrow this down for @mpflanzer or myself? Maybe finding a commit where this doesn't happen and then bisecting to identify the commit that introduced this? @mpflanzer, good point about roscpp vs rospy. This would explain why my simple test with |
Testing with the roscpp_tutorials and comparing to the python tutorials will help confirm that this is an issue with roscpp and not the master in general. If you can get the unit tests to compile correctly, that will be a substantial benefit to confirm that the integration tests work the same way on OSX as they do on Linux. It looks like the errors are a result of trouble with the build files and are not related to gmock or any other external dependencies, so this should be solvable with build changes. I think the oldest change that might have introduced this is #1216 . If you want to attempt a bisect, I would start with that change. |
Ok thanks @trainman419, I'll see what I can do and when. PS: I guess CI for macOS would be nice in the future. I wonder how hard it would be to set up with Travis these days. It might be a bit slow since you need to build everything from source, including dependencies, but that would be OK for testing only the main devel branches to detect regressions. |
I managed to compile the xmlrpcpp unit tests on macOS. The errors @NikolausDemmel was seeing are most likely caused by different compiler options between building the library and building the test. I had similar issues when linking against gtest (which I'm now linking statically). Anyway, I manually ran all test binaries. Please find the respective outputs attached. In summary:
Are there other tests I should run as well? I'm now going to double check roscpp vs rospy. |
@mpflanzer, cool, thanks for running the tests. What version of the code are you testing? Could you please still push your hacky workaround for the unit tests somewhere (maybe new branch on your fork of ros_comm)? |
I was using this archive: https://github.com/ros-gbp/ros_comm-release/archive/release/lunar/ros_comm/1.13.6-0.tar.gz |
My results from testing different combinations of rospy and roscpp:
"no ok" means that I have to rerun the listener multiple times (~10-50) before messages are received. If messages are received there seem to be no further problems.
|
You can just clone the source repository into a new overlay test-ws, something like:
|
I was finally able to get ros_base from Lunar installed on my laptop. I've pushed a fix for the test linking issues to lunar-devel...trainman419:lunar-devel (I was under-linking some of the test libraries) catkin no longer provides the I'm now able to build the tests, but I can't run them with I had a look at the test results from @mpflanzer , and it looks like the ulimit test is failing to set the ulimit at some point. This might be related to the fact that the ulimit is unlimited on OSX. @mpflanzer I'm on OSX El Capitan and I have not had any trouble yet with the |
Not sure if that helps, but you can always find the build directory for each package in |
You can do
That should build and run all tests. More information is available here: http://catkin-tools.readthedocs.io/en/latest/verbs/catkin_build.html#building-and-running-tests edit: Forgot to mention that you might need to abort the
Did you double-check that all tests have actually been built? I had the problem that for some reason |
I'm used to having tests build with the rest of my build, but apparently catkin build doesn't build tests. Now that I'm running |
With these changes it builds for me: trainman419/ros_comm@lunar-devel...mpflanzer:lunar-devel |
I just compiled with ros_comm version 1.13.4. I don't see any errors when running talker/listener, and there is no need to restart the nodes multiple times. Will try a bisect now. |
According to git bisect the following is the first commit where the talker doesn't receive messages. Some earlier commit already introduced the "Failed to contact master" errors but the connection still worked fine.
|
The first commit showing the "Failed to contact master" errors is:
|
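For readers unfamiliar with the bisect procedure used here, the following is a rough Python sketch of the search that `git bisect` performs: a binary search for the first commit at which a symptom appears. The commit labels `c0` to `c7` and both predicates are made up for illustration; they are not the commits found above. The sketch assumes that once a symptom appears, it persists in all later commits.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the newest one is bad,
    and every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo]


# Hypothetical history: the error messages start at c3,
# and messages stop being received at c5.
history = [f"c{i}" for i in range(8)]
shows_errors = lambda c: int(c[1:]) >= 3
drops_messages = lambda c: int(c[1:]) >= 5

print(first_bad(history, shows_errors))
print(first_bad(history, drops_messages))
```

Running the search once per symptom, as was done above, is what separates the commit that introduced the error messages from the later one that broke message delivery.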
The errors about "Failed to contact master" match my own debugging tonight. It looks like the semantics of the event flags returned by I've also pushed some changes to my branch to build on OSX while maintaining the symbol overriding on Linux. |
Hi, is there any news on this issue? I just ran into it as well. |
@NikolausDemmel @trainman419 @mpflanzer What is the status of this ticket? |
I have a branch with compilation fixes for some of the issues reported in XmlRpcpp: #1392, but according to @mpflanzer it also looks like the changes introduced by @guillaumeautran in #1281 are causing subscriber issues on OSX. Unfortunately I don't have the time to dig further into the issues introduced by #1281, and it's difficult to verify all of my fixes without also fixing those issues. |
Ok, I may have found the cause of the subscriber issues introduced by #1281. OSX does not have the Would someone with access to OSX be able to double check the fix please? |
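The comment above does not survive intact, so it is not clear exactly which facility is missing on OSX. As a general illustration of handling a platform-specific poll flag, here is a small Python sketch. It assumes, purely for the sake of example, that the flag is `POLLRDHUP`, which is Linux-specific and which Python's `select` module only exposes where the platform provides it. The helper names are hypothetical and this is not the change made in #1393.

```python
import select

# Assumption for illustration only: POLLRDHUP stands in for a flag that
# exists on Linux but not on OSX. Where it is missing, fall back to 0 so
# that OR-ing it into a mask is a no-op.
OPTIONAL_HANGUP_FLAG = getattr(select, "POLLRDHUP", 0)


def read_event_mask() -> int:
    """Event mask to register for a readable socket (hypothetical helper)."""
    return select.POLLIN | OPTIONAL_HANGUP_FLAG


def peer_closed(revents: int) -> bool:
    """True if the returned events indicate that the peer hung up."""
    return bool(revents & (select.POLLHUP | OPTIONAL_HANGUP_FLAG))


print("mask:", read_event_mask())
```

Detecting the flag at import time keeps the call sites identical on both platforms, rather than scattering platform checks through the event loop.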
Nice 👍 I should be able to give it a try tomorrow |
@mpflanzer PR is up #1393, I had to update the |
Fixed by #1393. |
It throws errors like
Seems like it was working fine in 1.13.5. See also discussion at mikepurvis/ros-install-osx#114 . It seems like #1243 introduced the offending lines.