Connection management improvements for multi-database setup #7244
Comments
Seems reasonable, not sure why we decided to include database in the key.
I think as a short-term fix they probably should not count towards max_shared_pool_size, since there is not much to be gained from it. The longer-term fix is to avoid a persistent connection per database. Connections from the maintenance daemon are cached because of how frequently the deadlock detector asks for lock graphs. However, we don't actually need a deadlock detector per database; just one is enough. The other (database-specific) operations in the maintenance daemon are much less frequent and could afford to open a new connection every time, and either honour max_shared_pool_size (block if none is available) or have a separate shared pool. Thanks for sharing your feedback on this!
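To give a rough sense of the frequency difference described above, here is a small sketch using settings that exist in PostgreSQL/Citus; the defaults quoted in the comments are only illustrative and assume a stock configuration:

```sql
-- Deadlock detection runs roughly every
--   deadlock_timeout * citus.distributed_deadlock_detection_factor
-- which with stock defaults (1s * 2) is about every 2 seconds, so caching
-- one connection per database is cheap to justify today.
SHOW deadlock_timeout;                               -- typically 1s
SHOW citus.distributed_deadlock_detection_factor;    -- typically 2

-- Other maintenance work, such as 2PC transaction recovery, runs far less
-- often and could plausibly open a fresh connection each time.
SHOW citus.recover_2pc_interval;                     -- typically 1min
```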
According to the comment, it is there specifically to support the per-database setting 🙂
Agreed, this improvement looks great! I was more focused on limiting the number of connections to workers than on their caching, but removing this unnecessary behaviour will also be beneficial.
I would go with a separate shared pool, but introduce an option to configure it as a percentage of citus.max_shared_pool_size.
I am actually up for contributing (we have suffered from this long enough), so if you approve, I may start right away 🙂
The main reason we implemented citus.max_shared_pool_size is to prevent the coordinator from establishing more connections than any worker can accept (e.g., citus.max_shared_pool_size < max_connections of the worker). I think making citus.max_shared_pool_size global (i.e., not per database) sounds useful and aligns with why we added it.

One important caveat here is that citus.max_shared_pool_size cannot/should not be settable per database, otherwise things might become complicated. As far as I know, SIGHUP GUCs -- such as shared_pool_size -- cannot be set per database.

When it comes to a separate pool vs. some percentage of the pool, I'm very much in favor of the latter. The former could be very hard to implement. Even the latter is non-trivial: there are a bunch of areas that need to be thought through carefully, such as making sure that this is not broken: https://github.com/citusdata/citus/blob/main/src/backend/distributed/connection/locally_reserved_shared_connections.c or https://github.com/citusdata/citus/blob/main/src/backend/distributed/executor/adaptive_executor.c#L4707

So, please make sure we have good test coverage involving these areas as well as the base scenarios with multiple databases.
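A minimal illustration of the two points above (the worker values and the database name are hypothetical; citus.max_shared_pool_size is a SIGHUP GUC, so it is changed cluster-wide via a configuration reload rather than per database):

```sql
-- On each worker, the connection budget the coordinator must stay under:
SHOW max_connections;                       -- e.g. 100 (hypothetical value)

-- On the coordinator, keep the shared pool below that budget and reload:
ALTER SYSTEM SET citus.max_shared_pool_size = 90;
SELECT pg_reload_conf();

-- A per-database override is not expected to work, since SIGHUP GUCs
-- cannot be set at the database level ('app_db' is a hypothetical name):
ALTER DATABASE app_db SET citus.max_shared_pool_size = 50;  -- expected to fail
```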
also, see #3671 as a reference
Great!
Thanks for the useful insights, @onderkalaci! Before I start, I have one question regarding the deadlock detector:
I assume the way to solve this is to have one "management" database for exactly such cluster-wide procedures, and that is what is being implemented now in the pool_2pc branch?
Yes, I think so. The goal of that infrastructure is to be able to sync and track role & database-related commands across the cluster by funneling them all through the management database (even from non-Citus databases). We could then also decide to run the deadlock detector only in the management database.
This issue is more of an invitation to a technical discussion than a solid proposal.
Currently, the main issue with having multiple `DATABASE`s in a cluster is connection management, which is not adapted to this. The issue can be split into two parts (a quick worked example follows the list):

1. Maintenance daemon connections are not limited by `citus.max_shared_pool_size` and open at least `n * d` connections per worker, where n -- number of nodes in the cluster, d -- number of `DATABASE`s.
2. Even for connections that do honour `citus.max_shared_pool_size`, this limit is applied per `DATABASE`, meaning that it will still require `n * d` connections.
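To put hypothetical numbers on the `n * d` term above: a cluster of n = 3 nodes running d = 20 databases keeps at least 3 * 20 = 60 maintenance-daemon connections open to every worker, before any user queries are taken into account.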
In order to overcome this, I propose to:

1. Make `citus.max_shared_pool_size` cluster-wide. A per-database setting may then be used to set a connection quota for a database within the global `citus.max_shared_pool_size`.
2. Make maintenance daemon connections count towards `citus.max_shared_pool_size`.

Since those changes make sense mostly for the multi-database setup, they may be enabled by a single GUC, like `citus.multi_database`, or have a separate GUC for the behaviour of `citus.max_shared_pool_size`, transaction recovery, and distributed deadlock detection.
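For concreteness, a rough sketch of how the proposed configuration might look. Only `citus.max_shared_pool_size` exists today; `citus.multi_database` is the name suggested in this issue, and the per-database quota parameter (and the `app_db` database) are invented purely for illustration:

```sql
-- What exists today: a shared pool limit, reloadable via SIGHUP.
ALTER SYSTEM SET citus.max_shared_pool_size = 90;
SELECT pg_reload_conf();

-- What the proposal sketches (hypothetical, not valid in any Citus release):
--
--   citus.multi_database = on
--       -- single switch to opt into the multi-database behaviour
--
--   ALTER DATABASE app_db SET citus.shared_pool_quota = '25%';
--       -- per-database quota as a percentage of the global pool;
--       -- both the parameter name and 'app_db' are invented for this sketch
```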