Engine Occasionally Hangs #1616
Comments
Two functions timed out here: |
All three of these domains have timeout issues:
us1.dev.emm.dell.com:
securemail.vsu.edu:
dell.cz:
|
This is almost guaranteed to be a bug in |
|
GDB traceback for WebHelper process:
Doesn't seem to reveal anything interesting. |
Same for DNSEngine:
|
So it's not a true deadlock. This is good. |
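When a native traceback shows only the event loop idling, listing the unfinished asyncio tasks can show which coroutine is still waiting. The following is a minimal sketch, not code from this project: `describe_pending_tasks`, `wait_forever`, and the task name `stuck-request` are hypothetical names used for illustration.

```python
import asyncio


def describe_pending_tasks() -> list[str]:
    """Return one line per unfinished task in the running event loop."""
    lines = []
    current = asyncio.current_task()
    for task in asyncio.all_tasks():
        if task is current or task.done():
            continue
        # For a suspended coroutine this yields the frame it is awaiting in.
        frames = task.get_stack(limit=1)
        if frames:
            code = frames[-1].f_code
            where = f"{code.co_name} at {code.co_filename}:{frames[-1].f_lineno}"
        else:
            where = "<no frame>"
        lines.append(f"{task.get_name()}: suspended in {where}")
    return lines


async def wait_forever(event: asyncio.Event) -> None:
    # Hypothetical stand-in for a request that never receives its response.
    await event.wait()


async def demo() -> list[str]:
    event = asyncio.Event()
    task = asyncio.create_task(wait_forever(event), name="stuck-request")
    await asyncio.sleep(0)  # let the task start and block on the event
    report = describe_pending_tasks()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return report


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

Calling something like this from a periodic status task would be one way to see which request is stuck without attaching a debugger.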
Sending
|
In the DNSEngine, resolution errors always seem to precede the hang:
|
Per @liquidsec, this seems to be a bug in the engine itself, not something specific to DNS. It also happens with web requests.
|
In this case, the server very clearly sent a response, which the client never received. This indicates a bug in zmq. |
In this case, it seems like the message never made it to the server. So we have unreliable communication in both directions. |
I've discovered that the default HWM (High-Water-Mark) has recently been changed from unlimited to 1000, and that the default action for
This sounds a lot like our culprit. I'll try increasing the HWMs. |
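To illustrate why a finite high-water mark could explain lost messages, here is a standard-library analogy, not zmq itself. A bounded `queue.Queue` stands in for a socket buffer with an HWM, and the sender drops anything that does not fit. Whether a real zmq socket drops or blocks at its HWM depends on the socket type, so this only sketches the drop case; `send_with_limit` is a hypothetical helper.

```python
import queue


def send_with_limit(buffer: queue.Queue, messages) -> list:
    """Try to enqueue every message; return the ones dropped at the limit."""
    dropped = []
    for message in messages:
        try:
            buffer.put_nowait(message)
        except queue.Full:
            # The receiver never learns that this message existed.
            dropped.append(message)
    return dropped


if __name__ == "__main__":
    buffer = queue.Queue(maxsize=3)  # analogous to a high-water mark of 3
    lost = send_with_limit(buffer, range(5))
    print("queued:", buffer.qsize(), "dropped:", lost)
```

From the receiver's side, a dropped message looks the same as a request that was never sent, which matches the symptom of a client waiting until its timeout.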
I'm still experiencing the 300-second timeouts even after increasing the HWM. The most consistent example, which happens on every single scan, is wayback. The wayback request has a timeout of 20 seconds, but it never times out properly on the server side. I tested it in isolation and the 20-second timeout works fine there, so I'm not sure what's happening. I think there's still a deep mystery to be unraveled:
|
This might be a race condition where The chances of this happening would probably increase as the async event loop gets more congested. To test this theory, I'm going to try raising the timeout interval and see if we get fewer instances of the bug. |
Raising the timeout interval seemed to fix it. 🎉🎉🎉🎉🎉🫠 |
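The exact race is not spelled out in this thread, so the following only sketches the general effect of a timeout that is too tight: a reply that arrives a little late is lost to the caller, while a more generous timeout receives it. This is not the engine's code; `slow_reply` and `request` are hypothetical names, and the delays are arbitrary.

```python
import asyncio
from typing import Optional


async def slow_reply(delay: float) -> str:
    # Hypothetical stand-in for a reply delayed by a congested event loop.
    await asyncio.sleep(delay)
    return "response"


async def request(delay: float, timeout: float) -> Optional[str]:
    """Return the reply, or None if it did not arrive within the timeout."""
    try:
        return await asyncio.wait_for(slow_reply(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return None


async def demo():
    tight = await request(delay=0.2, timeout=0.01)    # gives up too early
    generous = await request(delay=0.2, timeout=2.0)  # waits long enough
    return tight, generous


if __name__ == "__main__":
    print(asyncio.run(demo()))
```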
Fixed in #1678. |
@Sh4d0wHunt3rX were the status messages frozen as well? |
Yes, nothing was printed to either debug.log or the terminal. |
Next time that happens, would it be possible to give me SSH access? I need to attach to the process with gdb to see where it hung. |
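As a complement to attaching gdb, Python's standard `faulthandler` module can dump the Python-level traceback of every thread when the process receives a signal. The sketch below assumes the process can be changed to register the handler at startup and that it runs on a Unix-like system; `install_stack_dump` and the log file location are hypothetical.

```python
import faulthandler
import os
import signal
import tempfile


def install_stack_dump(signum=signal.SIGUSR1, path=None):
    """Dump all thread tracebacks to a file whenever `signum` is received.

    Returns the open file object (it must stay open) and its path.
    """
    if path is None:
        fd, path = tempfile.mkstemp(suffix=".stacks.log")
        os.close(fd)
    log_file = open(path, "w")
    # faulthandler writes straight to the file descriptor from the signal
    # handler, so it still works while the interpreter is stuck in a wait.
    faulthandler.register(signum, file=log_file, all_threads=True)
    return log_file, path


if __name__ == "__main__":
    log_file, path = install_stack_dump()
    print(f"send SIGUSR1 to pid {os.getpid()} to write tracebacks to {path}")
```

With this in place, running `kill -USR1 <pid>` against a hung process writes the tracebacks to the log file, which can then be shared without giving anyone shell access.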