-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backport fix libzmq #1584 #135
base: master
Are you sure you want to change the base?
Conversation
Backport fix "fixed zmq assertion in signaler.cpp under ubuntu \#1584"
Anyway to test it? |
I am not sure. This assertion was happening in a not-so-fast system with a very heavy load of messages. I applied (manually): It is a pretty straight back-port. However, after leaving the system for some time, another assertion is hit (immediately on the return of that function):
That is:
So I guess fix is not complete and problems are of a bigger scope... I am not sure if moving to 4.2.X (but I was experiencing similar issues, that's why I downgraded to 4.1.6). For the original assertion, I think You can reject the PR if you wish. But please let me know if there is some fix available somwhere for this, or if I have to implement a retrial to handle these 2 error codes and contribute back. |
Did you end up successfully porting this to 4.1.x? We've run into #1583 on 4.1.x too. I tried building my app with with 4.2.2 but ran into lots of problems right off the bat. Trying to decide now whether to forge ahead with 4.2.x or try to get this working in 4.1.x. |
What problems are you having with 4.2.x? It's API/ABI compatible so it should be a smooth upgrade from 4.1.x |
Hi Luca - thanks for checking with me on this. I ran the the tests for our library that is a layer on top of ZMQ using 4.1.6 plus the patch from this PR and using 4.2.2. I manually applied the change from this PR to 4.1.6 and our tests ran OK though I've only run them a few times. For 4.2.2, I've found that most of the ZMQ-related tests pass, but there are a few that crash each time they run. Here are the backtraces cut off after the first custom class. It's quite possible that the bug is in our code but is latent in 4.1.6.
|
Are those tests by any chance using/creating/closing the same socket from multiple threads? The first abort is because the (internal) message is trying to close is invalid (most likely closed from somewhere else already), the second one has an invalid poll |
Thanks! It turned out to be a race condition that closed sockets from another code path. Due to the race it didn't manifest under 4.1.x. I've fixed it on our side and now all of our tests pass using 4.2.2. I am going to recommend to my team that we upgrade to 4.2.2 to get this patch (for #1584) plus all of the other fixes that came with 4.2.x. |
Good to hear. |
The problem ( I applied the backport and our test suite ran fine. I was going to go ahead and upgrade to 4.2.2, but I ran into an issue that 4.2.2 does not compile with Visual Studio 8 (2005). We primarily use GCC 5.2 on Linux but have some apps using VS8 on Windows, and 4.1.x compiles with VS8. It sucks having to support VS8 but I may end up needing to go with 4.1.x anyway. |
What's the problem with VS8? @somdoron given the test done by @sourcedelica I would say it's enough for merging this, what do you think? |
Here's some of the errors building 4.2.2 using VS8 (2005):
|
I would suggest checking for existing issues on the libzmq repository, and if there's none opening one - sorry but I'm not a windows person so can't help too much with that |
Me neither :) I'll check in the libzmq issues. It looks like most of the errors are localized to a couple places. Thanks a lot for your help! |
@somdoron ping - you think these tests are enough for a merge? |
Backport fix to 4.1.X "fixed zmq assertion in signaler.cpp under ubuntu #1584"