Change psql concurrency from autocommit to serializable. #1190
base: master
Autocommit is giving concurrency errors in PostgreSQL when operations are sent in parallel. Using serializable transactions seems to fix it. For example:

ERROR: deadlock detected
DETAIL: Process 176184 waits for ShareLock on transaction 15529683; blocked by process 191002.
        Process 191002 waits for ShareLock on transaction 15529684; blocked by process 178678.
        Process 178678 waits for ExclusiveLock on tuple (1386,16) of relation 43815 of database 16391; blocked by process 176184.
        Process 176184: DELETE FROM ip_net_plan AS p WHERE vrf_id = 0 AND prefix = '10.0.10.240/28'
        Process 191002: DELETE FROM ip_net_plan AS p WHERE vrf_id = 0 AND prefix = '10.0.11.0/28'
        Process 178678: DELETE FROM ip_net_plan AS p WHERE vrf_id = 0 AND prefix = '10.0.10.208/28'
HINT: See server log for query details.
CONTEXT: while locking tuple (1386,16) in relation "ip_net_plan"
        SQL statement "UPDATE ip_net_plan SET children = (SELECT COUNT(1) FROM ip_net_plan WHERE vrf_id = OLD.vrf_id AND iprange(prefix) << iprange(old_parent.prefix) AND indent = old_parent.indent+1) WHERE id = old_parent.id"
        PL/pgSQL function tf_ip_net_plan__prefix_iu_after() line 92 at SQL statement
STATEMENT: DELETE FROM ip_net_plan AS p WHERE vrf_id = 0 AND prefix = '10.0.10.240/28'
nipap/nipap/backend.py
@@ -771,7 +771,7 @@ def _connect_db(self):
         while True:
             try:
                 self._con_pg = psycopg2.connect(**db_args)
-                self._con_pg.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
+                self._con_pg.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)
line too long (98 > 79 characters)
I have more feedback about this change. It works when concurrency is low, but as concurrency increases, the deadlocks cause timeouts. I think the actual problem is harder to fix: there are concurrency issues in the database code. Unfortunately, a fix for that is beyond my reach. So far we have mitigated the problem by adding retries in the frontend code, which hides the problem from our users.
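For anyone wanting to try the same mitigation, here is a minimal sketch of a retry wrapper. It is not the code we run in our frontend (that is not shown in this thread); `call_with_retries` and `is_retryable` are hypothetical names, and the sketch assumes the error object exposes the PostgreSQL SQLSTATE as a `pgcode` attribute, as psycopg2 errors do. `40P01` is deadlock_detected and `40001` is serialization_failure.

```python
import random
import time

# SQLSTATE codes PostgreSQL uses for transient concurrency failures.
DEADLOCK_DETECTED = "40P01"
SERIALIZATION_FAILURE = "40001"
RETRYABLE_SQLSTATES = {DEADLOCK_DETECTED, SERIALIZATION_FAILURE}


def is_retryable(exc):
    """Return True if the exception carries a retryable SQLSTATE.

    Assumption: the driver stores the SQLSTATE in `exc.pgcode`
    (psycopg2 does this for errors raised by the server).
    """
    return getattr(exc, "pgcode", None) in RETRYABLE_SQLSTATES


def call_with_retries(operation, attempts=5, base_delay=0.05, sleep=time.sleep):
    """Run operation() and retry it on deadlock or serialization failure.

    Hypothetical helper: `operation` must be safe to run again from
    scratch, including rolling back its own failed transaction.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Give up on unrelated errors and after the last attempt.
            if not is_retryable(exc) or attempt == attempts:
                raise
            # Exponential backoff with jitter, so competing clients
            # do not all retry at the same instant.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.5))
```

Note that this only hides the deadlocks from the caller; every retried operation still costs a failed round trip, which is why timeouts can still appear under high concurrency.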
Here is a different patch that undoes the previous change and locks the tables for the deletes. We have tested it for one week with no complaints so far. Performance is around 50% lower, but there are no more deadlocks for us.
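To illustrate the idea of locking the table for deletes, here is a rough sketch. It is not the patch itself: the function name `delete_prefix_locked` is hypothetical, the lock mode `SHARE ROW EXCLUSIVE` is an assumption (the patch may use a different mode), and the sketch assumes a psycopg2-style connection in autocommit mode, so the transaction is opened and closed with explicit `BEGIN`/`COMMIT` statements. `LOCK TABLE` only works inside a transaction block.

```python
# Assumed lock mode: it conflicts with itself and with ordinary writers,
# while plain SELECTs can still read the table.
LOCK_SQL = "LOCK TABLE ip_net_plan IN SHARE ROW EXCLUSIVE MODE"
DELETE_SQL = "DELETE FROM ip_net_plan WHERE vrf_id = %s AND prefix = %s"


def delete_prefix_locked(con, vrf_id, prefix):
    """Delete one prefix while holding a table-level lock.

    Hypothetical helper. `con` is assumed to be a DB-API connection in
    autocommit mode using the %s parameter style (as psycopg2 does).
    Returns the number of deleted rows.
    """
    cur = con.cursor()
    cur.execute("BEGIN")
    try:
        # Take the table lock first, so concurrent deletes queue up here
        # instead of deadlocking on row locks inside the trigger.
        cur.execute(LOCK_SQL)
        cur.execute(DELETE_SQL, (vrf_id, prefix))
        deleted = cur.rowcount
        cur.execute("COMMIT")
    except Exception:
        # Release the lock and discard partial work on any failure.
        cur.execute("ROLLBACK")
        raise
    return deleted
```

Because every delete waits for the table lock, writes are serialized, which is consistent with the lower throughput we measured.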
Psql Autocommit is giving concurrency errors in PostgreSQL when operations are sent in parallel. Using serializable transactions seems to fix it.
Example:
This problem is happening pretty often in our deployment (several times per day).
I can easily reproduce it with this simple (and partial) Python code.
list_of_prefixes is a file with a prefix on each line.
I guess this patch only hides the problem, but changing every query/transaction/function would not be as easy.