Add an option to check for liveness on checkout #47
It seems that if I bring a database down and then back up (to test this), the pool still recovers after about 5 tries (which is of course 5 too many), but possibly there already is some logic like this in there, which is good news. Another option (though possibly more complex) might be to re-check connections "on checkin" as well as "periodically if idle", FWIW :)
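A rough sketch of the "periodically if idle" idea, assuming an application-level background fiber built on crystal-db's public API rather than anything inside the pool itself (the connection string, interval, and helper name are placeholders):

```crystal
require "db"
require "pg" # assumes the postgres driver shard is installed

# Periodically run a trivial query through the pool so that dead sockets
# get exercised and discarded before a real request trips over them.
def start_periodic_liveness_probe(db : DB::Database, interval : Time::Span = 30.seconds)
  spawn do
    loop do
      sleep interval
      begin
        db.scalar "SELECT 1"
      rescue ex : DB::Error | IO::Error
        # A failing probe is fine: the pool drops the broken connection
        # and will open a fresh one on the next checkout.
      end
    end
  end
end

db = DB.open("postgres://localhost/app_db")
start_periodic_liveness_probe(db)
```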
Did you end up finding a solution for this @rdp? We're working with GCP's hosted Postgres-compatible database, and new connections take upwards of 2 seconds to establish, meaning we really need to rely on the pool. Another issue is that the pool doesn't recover from a DB outage; we work around that using health checks, but it's not optimal.
I thought they had added some background check, but I'm not sure.

> On Friday, February 12, 2021, Stephen von Takach wrote:
>
> did you end up finding a solution for this @rdp?
>
> Working with GCP's hosted postgres compatible database and new connections take upwards of 2 seconds to connect, meaning we really need to rely on the pool.
>
> So we have the pool configured with max_idle_pool_size == max_pool_size == 50, but if nothing is going on for a while the connections eventually time out and broken sockets are returned, causing 500 errors in our web app:
>
> Caused by: (DB::ConnectionLost)
>
> Another issue is the pool doesn't recover from a DB outage - we work around that using health checks, but it's not optimal.
The DB pool should recover from a DB outage within some limits; see https://crystal-lang.org/reference/database/connection_pool.html. My idea would be to have something like Go's ConnMaxLifetime, but I would like to hear what would be better.
Having a connection check upon checkout could be combined easily with 3, but 1 and 2 could also be extended to incorporate a specific check defined per driver.
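To make the ConnMaxLifetime idea concrete, here is a rough sketch of the bookkeeping such an option would need; the class and method names are hypothetical and not part of crystal-db:

```crystal
# Hypothetical bookkeeping for a ConnMaxLifetime-style option: remember when
# each connection was opened, and report at checkout time whether it has
# outlived the configured lifetime and should be closed and replaced.
class ConnectionLifetimes
  def initialize(@max_lifetime : Time::Span)
    @opened_at = {} of UInt64 => Time
  end

  # Record the creation time of a freshly opened connection.
  def register(conn) : Nil
    @opened_at[conn.object_id] = Time.utc
  end

  # True if the connection is older than the allowed lifetime.
  def expired?(conn) : Bool
    opened = @opened_at[conn.object_id]? || return false
    Time.utc - opened > @max_lifetime
  end

  # Forget a connection that has been closed.
  def forget(conn) : Nil
    @opened_at.delete(conn.object_id)
  end
end
```

A pool integrating this would call register when it opens a connection, check expired? at checkout, and close and reopen any connection that has outlived the limit.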
Keeping the […] I think point 3 might be problematic, at least with GCP, as the new connection would take time to be ready. It might be better to call it target_conn_life_time to hint at this.
We were recently doing some testing with https://github.com/anykeyh/clear and can confirm that it doesn't recover from a DB disconnection. The reason for this is that clear handles the […]. Looking at this line: https://github.com/crystal-lang/crystal-db/blob/master/src/db/database.cr#L126

```crystal
def using_connection
  connection = self.checkout
  begin
    yield connection
  ensure
    raise ConnectionLost.new(connection) if connection.closed?
    connection.release
  end
end
```

or possibly something like […]
@stakach I've been running into frequent (5-10 daily) ConnectionLost errors in a new Lucky app, and I'm curious if the work you did in a25f336 resolved this? I'm testing a patch right now similar to what you quoted above, but from my reading of the code it seems like the call to .release should handle all of this regardless. I'm a bit stumped. 🤔
@robcole It should have resolved this; however, it's up to the DB implementation to implement retries. For the postgres shard I think you do this via the connection string, so that if a disconnection does occur, your app recovers seamlessly.
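For illustration only, a connection string using the pool options documented in the crystal-db connection pool reference linked above (the host, credentials, and exact values here are placeholders):

```crystal
require "db"
require "pg" # assumes the postgres driver shard

# Pool and retry options are plain query parameters on the connection URI;
# retry_attempts/retry_delay let the pool try to re-establish connections
# after a disconnect instead of immediately surfacing DB::ConnectionLost.
DB.open("postgres://app:secret@localhost/app_db?max_idle_pool_size=50&max_pool_size=50&retry_attempts=3&retry_delay=0.5") do |db|
  puts db.scalar("SELECT 1")
end
```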
Feature request: AFAICT there is no concept of "is this connection still alive" in the connection pool system.
Adding something like that might be nice. The easiest way I can think of (off the top of my head) to implement this would be to perform some trivial check at checkout time, like "select 1" or what not, before returning a connection, or something similar.
Thanks!
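As a rough sketch of what that could look like from the application side today, here is a hypothetical helper built on crystal-db's public checkout/release API (the helper name, probe query, and retry count are assumptions, not an existing pool option):

```crystal
require "db"
require "pg" # assumes the postgres driver shard

# Hypothetical "liveness check on checkout": probe the connection with a
# trivial query before handing it to the caller, and swap it for a fresh
# one if the probe fails.
def checkout_alive(db : DB::Database, attempts : Int32 = 2) : DB::Connection
  attempts.times do
    conn = db.checkout
    begin
      conn.scalar "SELECT 1" # cheap round trip; raises if the socket is dead
      return conn
    rescue ex : DB::Error | IO::Error
      begin
        conn.close # discard the broken connection
      rescue
      end
    end
  end
  raise DB::Error.new("no live connection after #{attempts} attempts")
end

db = DB.open("postgres://localhost/app_db")
conn = checkout_alive(db)
begin
  puts conn.scalar("SELECT 1")
ensure
  conn.release
end
```

The caller is still responsible for releasing the connection, as in the begin/ensure block above.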