iRODS disconnect() exceptions in production #1542

Closed
mikkonie opened this issue Nov 25, 2022 · 3 comments
Assignees
Labels
app: irodsbackend Issue in the irodsbackend app bug Something isn't working ongoing Ongoing issue, needs observation or pending on other projects

Comments

@mikkonie
Contributor

mikkonie commented Nov 25, 2022

Occasionally, I've observed exceptions from closing down iRODS connections during __del__(). Perhaps we are trying to close the iRODS connection twice.

Last I checked, we have to manually close iRODS connections in python-irodsclient, because otherwise they are left hanging open on the iRODS server and eventually cause the server to refuse new connections. Perhaps something has changed in the current client version which I didn't spot.

Oddly enough, I have not observed this when developing, only in production.

This is not a huge issue as the connection is closed as intended, but it does pollute logs.

Dump of the exception below; sadly, it doesn't provide much of a traceback.

@mikkonie mikkonie added bug Something isn't working ongoing Ongoing issue, needs observation or pending on other projects app: irodsbackend Issue in the irodsbackend app labels Nov 25, 2022
@mikkonie mikkonie self-assigned this Nov 25, 2022
@mikkonie
Contributor Author

Docker log:

sodar-web_1              | irods.exception.NetworkException: Unable to send message
sodar-web_1              | Exception ignored in: <function Connection.__del__ at 0x7f7f9ca145e0>
sodar-web_1              | Traceback (most recent call last):
sodar-web_1              |   File "/usr/local/lib/python3.8/site-packages/irods/connection.py", line 85, in __del__
sodar-web_1              |     self.disconnect()
sodar-web_1              |   File "/usr/local/lib/python3.8/site-packages/irods/connection.py", line 291, in disconnect
sodar-web_1              |     self.send(disconnect_msg)
sodar-web_1              |   File "/usr/local/lib/python3.8/site-packages/irods/connection.py", line 101, in send
sodar-web_1              |     raise NetworkException("Unable to send message")
sodar-web_1              | irods.exception.NetworkException: Unable to send message

@mikkonie mikkonie removed the ongoing Ongoing issue, needs observation or pending on other projects label Nov 25, 2022
@mikkonie mikkonie added this to the v0.13.0 milestone Nov 25, 2022
@mikkonie
Contributor Author

It seems to me that upon cleanup(), python-irodsclient attempts to call disconnect() on a connection in its pool, but can't reach it, so the connection simply gets released. Meanwhile, we get some SSL connection errors in the iRODS log.

Looking at /etc/init.d/irods status on the iRODS host in docker, there don't seem to be many open connections hanging around. That would lead me to believe that the connection is indeed closed, but the iRODS client tries to close it again for some reason? Does this also create the error in the iRODS log?

Also, if we don't call cleanup() manually in SODAR, this will result in open connections remaining in iRODS and the server eventually refusing further connections. Just tested and ensured this remains the case on the current version of python-irodsclient in use.

We could try asking the iRODS developers, as this may well be a bug in iRODS or the client.
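
One way to keep the manual cleanup() call, while making sure it runs exactly once per unit of work and that a failed disconnect is logged instead of raised, is a small context manager. This is only a sketch under assumptions: `managed_session` and `FakeSession` are hypothetical names made up for the example, the only thing assumed about the real session object is that it exposes the cleanup() method discussed above, and this is not necessarily how SODAR handles it.

```python
import logging
from contextlib import contextmanager

logger = logging.getLogger(__name__)


class FakeSession:
    """Hypothetical stand-in for a session object that exposes cleanup()."""

    def __init__(self, fail_on_cleanup=False):
        self.fail_on_cleanup = fail_on_cleanup
        self.cleanup_calls = 0

    def cleanup(self):
        self.cleanup_calls += 1
        if self.fail_on_cleanup:
            # Mimic a disconnect that cannot reach the server.
            raise ConnectionError("Unable to send message")


@contextmanager
def managed_session(session_factory):
    """Yield a session and always call cleanup() once when the block exits."""
    session = session_factory()
    try:
        yield session
    finally:
        try:
            session.cleanup()
        except Exception as ex:
            # Assumption: a failed disconnect is worth a log line, not a crash.
            logger.warning("iRODS session cleanup failed: %s", ex)


# Usage: cleanup fails, but the error is logged and not raised to the caller.
failing = FakeSession(fail_on_cleanup=True)
with managed_session(lambda: failing) as session:
    pass

# Usage: an error in the body still propagates, and cleanup still runs once.
healthy = FakeSession()
try:
    with managed_session(lambda: healthy):
        raise ValueError("error while using the session")
except ValueError:
    body_error_propagated = True
```

Catching a broad `Exception` in the cleanup step is a deliberate choice in this sketch, since the connection is released either way. A real implementation might prefer to catch only the client's network exception.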

@mikkonie mikkonie removed this from the v0.13.0 milestone Nov 28, 2022
@mikkonie mikkonie added the ongoing Ongoing issue, needs observation or pending on other projects label Nov 28, 2022
@mikkonie
Contributor Author

As expected, these were fixed by #909. Part of the problem was switching to the default sync gunicorn worker, which meant the old way of cleaning up connections no longer worked. Closing.
