This PR optimizes the procedure that applies changes to PostgreSQL.
The problem is that when the source (MySQL) produces a large number of changes, the replay process may not keep up with the flow of incoming events. The situation is exacerbated when a frequently changing replicated table has many columns, because one block of the query multiplies every column of the table with a JSON object containing all columns and their data. This can produce hundreds of gigabytes of intermediate results that PostgreSQL has to process (sort, group, apply functions). Even a high work_mem setting does not improve this significantly.
The situation gets even worse if the table has a compound primary key.
This fix eliminates the need to multiply the JSON object by the number of columns in the table and by the number of columns in the primary key.
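
To give a sense of the scale involved, here is a toy model of the blow-up described above. It is only an illustration under assumptions, not the query from this PR: the function names (`multiplied_rows`, `single_rows`, `total_payload_bytes`), the 50-column event, and the two-column primary key are all hypothetical.

```python
import json


def multiplied_rows(event, pk_columns):
    """Toy model of the old behaviour (assumption): one copy of the full
    JSON payload for every (table column, primary key column) pair."""
    payload = json.dumps(event)
    return [(col, pk, payload) for col in event for pk in pk_columns]


def single_rows(event, pk_columns):
    """Toy model of the fixed behaviour (assumption): the JSON payload
    is carried once per replicated event."""
    payload = json.dumps(event)
    return [(tuple(pk_columns), payload)]


def total_payload_bytes(rows):
    """Sum the size of the JSON payload carried by each intermediate row."""
    return sum(len(row[-1]) for row in rows)


# Hypothetical event: a 50-column table with a two-column primary key.
event = {f"col_{i}": i for i in range(50)}
pk = ["col_0", "col_1"]

old = multiplied_rows(event, pk)
new = single_rows(event, pk)

print(len(old), len(new))  # 100 1
print(total_payload_bytes(old) // total_payload_bytes(new))  # 100
```

In this model a single event on a 50-column table with a two-column key carries 100 copies of its payload, so the intermediate data grows with columns × key columns × events.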