
Add an option to check for liveness on checkout #47

Open
rdp opened this issue Apr 5, 2017 · 8 comments

rdp commented Apr 5, 2017

Feature request: AFAICT there is no concept of "is this connection still alive" in the connection pool system.

Adding something like that might be nice. The easiest way I can think of (off the top of my head) to implement this would be to perform some trivial check at checkout time, like a "select 1" or whatnot, before returning a connection, or something similar.
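
A minimal sketch of what that check could look like from application code today, using only the public crystal-db API; the driver, URI, and the with_live_connection helper name are illustrative, not part of crystal-db:

    require "db"
    require "pg" # any crystal-db driver works; pg is only an example

    db = DB.open("postgres://user@localhost/app")

    # Illustrative helper (not crystal-db API): run a trivial round trip on the
    # checked-out connection so a dead socket fails fast instead of surfacing
    # mid-request.
    def with_live_connection(db : DB::Database)
      db.using_connection do |conn|
        conn.scalar("SELECT 1") # raises (e.g. DB::ConnectionLost) if the socket is dead
        yield conn
      end
    end

    with_live_connection(db) do |conn|
      puts conn.scalar("SELECT 42")
    end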

Thanks!


rdp commented Apr 12, 2017

It seems that if I bring a database down and then back up (to test this), the pool still recovers after about 5 tries (which is of course 5 too many), so possibly there already is some logic like this in there, which is good news. Another option (though possibly more complex) might be to re-check connections "on checkin" as well as "periodically while idle", FWIW :)
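
A rough sketch of the "periodically while idle" idea; idle_connections is a hypothetical stand-in for whatever idle list the pool keeps, so this is purely illustrative:

    # Hypothetical periodic liveness sweep; idle_connections is not a real
    # crystal-db API, just a placeholder for the pool's idle connections.
    spawn do
      loop do
        sleep 30.seconds
        idle_connections.each do |conn|
          begin
            conn.scalar("SELECT 1") # trivial round trip
          rescue
            conn.close # discard dead connections instead of handing them out later
          end
        end
      end
    end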


stakach commented Feb 13, 2021

Did you end up finding a solution for this, @rdp?

We're working with GCP's hosted Postgres-compatible database, and new connections take upwards of 2 seconds to establish, so we really need to rely on the pool.
We have the pool configured with max_idle_pool_size == max_pool_size == 50, but if nothing is going on for a while the connections eventually time out and broken sockets are returned, causing 500 errors in our web app (Caused by: (DB::ConnectionLost)).

Another issue is that the pool doesn't recover from a DB outage. We work around that with health checks, but it's not optimal.
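
For reference, a sketch of the pool configuration described above, passed through the connection URI (host, user, and database are placeholders):

    require "db"
    require "pg"

    # Illustrative only: the pool sizing from the comment above, using the
    # documented crystal-db pool settings in the connection URI.
    url = "postgres://app_user@db-host/app_db?initial_pool_size=50&max_pool_size=50&max_idle_pool_size=50"

    DB.open(url) do |db|
      db.scalar("SELECT 1") # sanity check that a pooled connection works
    end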


rdp commented Feb 16, 2021 via email

bcardiff (Member) commented:

The DB pool should recover from a DB outage within some limits. See https://crystal-lang.org/reference/database/connection_pool.html

My idea would be to have something like Go's ConnMaxLifetime, but I would like to hear what would be better.

  1. To ensure that initial_pool_size young connections are always available in the pool.
  2. To ensure that at least initial_pool_size young connections are always available in the pool while honoring max_idle_pool_size and max_pool_size.
  3. Upon connection checkout, recreate it if the max lifetime was exceeded.

Having a connection check upon checkout could be combined easily with 3, but 1 and 2 could also be extended to incorporate a specific check defined per driver.
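
A minimal sketch of option 3 as a tiny standalone pool; max_lifetime, created_at, and build are hypothetical names, and none of this is current crystal-db API:

    require "db"

    # Illustrative only: each pooled connection remembers when it was created,
    # and checkout discards anything older than max_lifetime.
    record PooledConnection, conn : DB::Connection, created_at : Time do
      def expired?(max_lifetime : Time::Span) : Bool
        Time.utc - created_at > max_lifetime
      end
    end

    class LifetimePool
      def initialize(@max_lifetime : Time::Span, @build : -> DB::Connection)
        @idle = [] of PooledConnection
      end

      # Option 3: upon checkout, recreate the connection if its lifetime was exceeded.
      def checkout : PooledConnection
        while pooled = @idle.shift?
          return pooled unless pooled.expired?(@max_lifetime)
          pooled.conn.close # too old: discard and keep looking
        end
        PooledConnection.new(@build.call, Time.utc) # nothing young enough was idle
      end

      def release(pooled : PooledConnection)
        @idle << pooled # keep the original created_at so age keeps accruing
      end
    end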


stakach commented Feb 16, 2021

Keeping initial_pool_size worth of connections young (not exceeding the max lifetime) makes sense to me, though it might be disruptive to recycle them all at once unless a new connection is spun up before an old one is removed.

I think point 3 might be problematic, at least with GCP, as the new connection would take time to be ready.
It might make sense to refresh the connection on return to the pool instead - i.e. don't return it, and create a new connection in its place.
With points 1 and 2 making sure young connections are generally available, it is probably less of an issue to occasionally get an older connection, provided the user configures the max_life_time timeout with a bit of a buffer.

It might be better to call it target_conn_life_time to hint at this.
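
A small sketch of that refresh-on-check-in idea, reusing the hypothetical LifetimePool names from the sketch above (again not crystal-db API, and without any locking):

    # Illustrative continuation of the LifetimePool sketch: instead of returning
    # an old connection to the idle list, close it and build a replacement in
    # the background so checkout latency is unaffected.
    class LifetimePool
      def release(pooled : PooledConnection)
        if pooled.expired?(@max_lifetime)
          pooled.conn.close
          spawn { @idle << PooledConnection.new(@build.call, Time.utc) }
        else
          @idle << pooled
        end
      end
    end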


stakach commented Sep 3, 2021

We were recently doing some testing with https://github.com/anykeyh/clear and can confirm that it doesn't recover from a DB disconnection.

The reason for this is that clear handles the DB::ConnectionLost exception; however, even if the connection is closed? it is still returned to the pool... which seems like an oversight.
I think closed connections should be discarded regardless of whether ConnectionLost is raised.

Looking at this line: https://github.com/crystal-lang/crystal-db/blob/master/src/db/database.cr#L126
We could do something like

    def using_connection
      connection = self.checkout
      begin
        yield connection
      ensure
        # surface the dead connection rather than silently returning it to the pool
        raise ConnectionLost.new(connection) if connection.closed?
        connection.release
      end
    end

or possibly something like connection.release unless connection.closed? would work. I'm not entirely across the internals here.
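
For comparison, that second alternative sketched in the same style (also just illustrative):

    def using_connection
      connection = self.checkout
      begin
        yield connection
      ensure
        # simply never return a dead connection to the pool; no extra raise
        connection.release unless connection.closed?
      end
    end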


robcole commented Apr 26, 2022

@stakach I’ve been running into frequent (5-10 daily) ConnectionLost errors in a new Lucky app, and I’m curious if the work you did in a25f336 resolved this?

I’m testing a patch right now similar to what you quoted above, but from my reading of the code it seems like the call to .release should handle all of this regardless. I’m a bit stumped. 🤔


stakach commented Apr 26, 2022

@robcole It should have resolved this; however, it's up to the DB implementation to implement retries. For the postgres shard I think you do this via the connection string, e.g. retry_attempts=4:
postgres://user@/test?initial_pool_size=5&retry_attempts=4

That way, if a disconnection does occur, your app recovers seamlessly.
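
In application code that amounts to something like the following (connection string copied from above, with the usual placeholders):

    require "db"
    require "pg"

    # retry_attempts rides along in the connection URI; per the crystal-db
    # connection pool docs, the pool retries re-establishing lost connections
    # up to that many times before giving up.
    DB.open("postgres://user@/test?initial_pool_size=5&retry_attempts=4") do |db|
      db.scalar("SELECT 1")
    end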
