Unhandled error in sys.excepthook #30

Open
pytestbot opened this issue Jul 21, 2017 · 1 comment

Comments

@pytestbot

After closing the connections, we see a few thread errors that seem impossible to get rid of. By the time they are written to stderr, nothing is left running and the code has already closed the gateway.

Unhandled exception in thread started by
Error in sys.excepthook:

Original exception was:

Running execnet with debugging enabled shows this:

[53488] gw0 [receiver-thread] RECEIVERTHREAD: starting to run
[53488] gw0 sent <Message.CHANNEL_EXEC channelid=1 len=6354>
[53488] gw0 sent <Message.CHANNEL_DATA channelid=1 'platform_information()'>
[53488] gw0 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 ('Ubuntu', '12.04', 'precise')>
[53488] gw0 sent <Message.CHANNEL_DATA channelid=1 'machine_type()'>
[53488] gw0 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 'x86_64'>
[53488] gw0 sent <Message.CHANNEL_DATA channelid=1 'shortname()'>
[53488] gw0 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 'node1'>
[53488] gw0 sent <Message.CHANNEL_DATA channelid=1 len=324>
[53488] gw0 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 None>
[53488] gw0 sent <Message.CHANNEL_DATA channelid=1 len=127>
[53488] gw0 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 None>
[53488] gw0 gateway.exit() called
[53488] gw0 --> sending GATEWAY_TERMINATE
[53488] gw0 --> io.close_write
[53488] gw0 [receiver-thread] EOF without prior gateway termination message
[53488] gw0 [receiver-thread] entering finalization
[53488] gw0 finished receiving
[53488] gw0 [receiver-thread] terminating execution
[53488] gw0 [receiver-thread] closing read
[53488] gw0 [receiver-thread] closing write
[53488] gw0 [receiver-thread] leaving finalization
[53488] gw1 [receiver-thread] RECEIVERTHREAD: starting to run
[53488] gw1 sent <Message.CHANNEL_EXEC channelid=1 len=6354>
[53488] gw1 sent <Message.CHANNEL_DATA channelid=1 'platform_information()'>
[53488] gw1 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 ('CentOS', '6.4', 'Final')>
[53488] gw1 sent <Message.CHANNEL_DATA channelid=1 'machine_type()'>
[53488] gw1 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 'x86_64'>
[53488] gw1 sent <Message.CHANNEL_DATA channelid=1 'shortname()'>
[53488] gw1 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 'node2'>
[53488] gw1 sent <Message.CHANNEL_DATA channelid=1 len=324>
[53488] gw1 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 None>
[53488] gw1 sent <Message.CHANNEL_DATA channelid=1 len=127>
[53488] gw1 [receiver-thread] received <Message.CHANNEL_DATA channelid=1 None>
[53488] gw1 gateway.exit() called
[53488] gw1 --> sending GATEWAY_TERMINATE
[53488] gw1 --> io.close_write
[53488] === atexit cleanup <Group []> ===
[53488] gw0 1 channel.__del__
[53488] gw1 1 channel.__del__
Unhandled exception in thread started by
Error in sys.excepthook:

Original exception was:
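
One detail stands out in the trace above: gw0's receiver thread logs "leaving finalization" after gateway.exit(), while gw1 logs nothing between "io.close_write" and the atexit cleanup. That may mean gw1's receiver thread was still alive when the interpreter started shutting down, though this is only a reading of the log, not a confirmed diagnosis. A minimal sketch for checking a longer trace the same way follows; the helper name `unfinished_gateways` and the sample text are illustrative and not part of execnet.

```python
import re

# Matches trace lines such as "[53488] gw0 [receiver-thread] leaving finalization".
LINE_RE = re.compile(r"^\[\d+\]\s+(gw\d+)\s+(.*)$")


def unfinished_gateways(trace_text):
    """Return gateways that called exit() but never logged 'leaving finalization'."""
    exited = []        # gateways in the order they called exit()
    finalized = set()  # gateways whose receiver thread finished finalization
    for line in trace_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # e.g. the "=== atexit cleanup ===" line
        gateway, message = match.groups()
        if "gateway.exit() called" in message and gateway not in exited:
            exited.append(gateway)
        elif "leaving finalization" in message:
            finalized.add(gateway)
    return [gw for gw in exited if gw not in finalized]


# Shortened excerpt of the trace from this report.
SAMPLE = """\
[53488] gw0 gateway.exit() called
[53488] gw0 [receiver-thread] leaving finalization
[53488] gw1 gateway.exit() called
[53488] gw1 --> io.close_write
[53488] === atexit cleanup <Group []> ===
"""

print(unfinished_gateways(SAMPLE))  # prints ['gw1']
```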

It looks as though this is happening in the threading module in Python itself. I am not sure whether there is any way to suppress these messages.

There is a ticket in the Python bug tracker that describes this problem, but it has been closed:

http://bugs.python.org/issue1722344

One of the patches attached by a commenter looks like what we would need to get rid of them (http://bugs.python.org/file9356/1722344_squelch_exception.patch):

--- /usr/lib/python2.5/threading.py.orig	2008-02-04 13:58:18.000000000 -0600
+++ ./threading.py	2008-02-05 11:55:33.000000000 -0600
@@ -472,34 +472,18 @@
                     _sys.stderr.write("Exception in thread %s:\n%s\n" %
                                       (self.getName(), _format_exc()))
                 else:
-                    # Do the best job possible w/o a huge amt. of code to
-                    # approximate a traceback (code ideas from
-                    # Lib/traceback.py)
-                    exc_type, exc_value, exc_tb = self.__exc_info()
-                    try:
-                        print>>self.__stderr, (
-                            "Exception in thread " + self.getName() +
-                            " (most likely raised during interpreter shutdown):")
-                        print>>self.__stderr, (
-                            "Traceback (most recent call last):")
-                        while exc_tb:
-                            print>>self.__stderr, (
-                                '  File "%s", line %s, in %s' %
-                                (exc_tb.tb_frame.f_code.co_filename,
-                                    exc_tb.tb_lineno,
-                                    exc_tb.tb_frame.f_code.co_name))
-                            exc_tb = exc_tb.tb_next
-                        print>>self.__stderr, ("%s: %s" % (exc_type, exc_value))
-                    # Make sure that exc_tb gets deleted since it is a memory
-                    # hog; deleting everything else is just for thoroughness
-                    finally:
-                        del exc_type, exc_value, exc_tb
+                    # If _sys is missing, then the interpreter is shutting
+                    # down and the thread should no longer exist.  If this
+                    # happens, ignore the error and exit gracefully.
+                    pass
             else:
                 if __debug__:
                     self._note("%s.__bootstrap(): normal return", self)
         finally:
-            self.__stop()
+            # Exceptions will also be raised during stop/delete if the
+            # interpreter is shutting down.  Ignore these as well.
             try:
+                self.__stop()
                 self.__delete()
             except:
                 pass
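
The patch above edits Python 2.5's threading.py directly, which is not something an application can ship. As a rough analogue only, Python 3.8 and later expose `threading.excepthook`, which can be replaced from user code. The sketch below drops thread exceptions once `sys.is_finalizing()` reports that the interpreter is shutting down and delegates to the previous hook otherwise. The names `make_quiet_excepthook` and `install_quiet_excepthook` are hypothetical, and whether this would silence the exact messages in this report is an assumption that has not been verified.

```python
import sys
import threading


def make_quiet_excepthook(previous_hook, is_shutting_down=sys.is_finalizing):
    """Build a threading.excepthook replacement.

    Exceptions raised while the interpreter is shutting down are dropped,
    similar in spirit to the `pass` branch in the patch; all other
    exceptions are forwarded to the previous hook unchanged.
    """
    def hook(args):
        if is_shutting_down():
            return  # squelch shutdown-time noise
        previous_hook(args)
    return hook


def install_quiet_excepthook():
    """Install the filtering hook (requires Python 3.8+)."""
    threading.excepthook = make_quiet_excepthook(threading.excepthook)
```

Calling `install_quiet_excepthook()` once at startup would be enough. Exceptions raised during normal operation are still reported, because the hook only filters while the interpreter is finalizing.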
@pytestbot

Original comment by @alfredodeza

We have been able to get rid of these empty errors by closing sys.stderr and sys.stdout in ceph-deploy (see the reference issue: http://tracker.ceph.com/issues/7594).

This is how we now call main():

import os
import sys

# main() is the program's entry point, defined elsewhere in the script.

if __name__ == '__main__':
    try:
        sys.exit(main())
    finally:
        # This block is crucial to avoid having issues with
        # Python spitting out nonsense thread exceptions. We have already
        # handled what we could, so close stderr and stdout.
        if not os.environ.get('CEPH_DEPLOY_TEST'):
            try:
                sys.stdout.close()
            except:
                pass
            try:
                sys.stderr.close()
            except:
                pass
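
The same workaround can be pulled into a small helper so it is easier to reuse and to test. This is a minimal sketch, not the ceph-deploy code: the function name `close_standard_streams` and its `streams`, `environ`, and `skip_var` parameters are made up for illustration, and only the `CEPH_DEPLOY_TEST` variable comes from the snippet above.

```python
import os
import sys


def close_standard_streams(streams=None, environ=None, skip_var="CEPH_DEPLOY_TEST"):
    """Close stdout and stderr, ignoring any error raised by close().

    Returns False without closing anything when `skip_var` is set in the
    environment (for example under a test runner), True otherwise.
    """
    environ = os.environ if environ is None else environ
    if environ.get(skip_var):
        return False
    if streams is None:
        streams = (sys.stdout, sys.stderr)
    for stream in streams:
        try:
            stream.close()
        except Exception:
            # The stream may already be closed or replaced; nothing
            # useful can be done about it this late in the process.
            pass
    return True
```

It would be called from the `finally:` block in place of the two try/except pairs. The trade-off is the same as in the original: once stderr is closed, any later output is lost, including a genuine traceback.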
