Choose WAL-G backup by start time when cloning and deleting #1
Conversation
Thanks! Will take a look early next week!
LGTM!
👍
This fixes the immediate problem of cleaning up backups based on start time instead of modified time: fcb00b1

Note that this is not at all backwards compatible with

However, when working on this I found this: wal-g/wal-g#816, which suggests that there is already built-in support for this.

Also, using
Force-pushed from fcb00b1 to bb77d67
Here's a new patch that makes use of

Diff it with ignore whitespace and it'll look less intimidating. :)
Force-pushed from bb77d67 to 30b4d13
FWIW the Spilo tests pass with these changes cherry-picked on master and with

However, without

This is expected because if the backups are created with

Note that this should only be an issue if you are trying to migrate from
I think it makes sense to ask if they are planning on removing wal-e support.
Ugh, running into some issues building the image from this branch; it seems to fail on building wal-g 🤔
Blocked by: wal-g/wal-g#1738

Temp fix to allow for testing while we wait for the fix to be released: 8416ba1
I saw that the fix on garry-t's fork was merged into upstream wal-g. Should we switch to using the upstream repo, or sync and create a branch on our fork instead?
Waiting for a release to materialize: wal-g/wal-g#1739 (comment)
This reverts commit 8416ba1.
A wal-g release has been created with the fix, and upstream Spilo has brought it in. I've made some changes to better conform with Spilo master. If you think the changes look good, I'll try to get them merged upstream as well.
@@ -70,20 +72,25 @@ def choose_backup(backup_list, recovery_target_time):
     match_timestamp = match = None
     for backup in backup_list:
-        last_modified = parse(backup['last_modified'])
+        last_modified = parse(backup['start_time' if os.getenv('USE_WALG_RESTORE') == 'true' else 'last_modified'])
Hi Simon, should this be `finish_time` instead of `start_time` of the backup to compare with the `recovery_target_time`? Postgres requires that the recovery target "must be after the ending time of the base backup" (https://www.postgresql.org/docs/13/continuous-archiving.html).
Thanks for bringing this up!

The rationale behind picking `start_time` over `finish_time` is that one backup might have been started first but finished second. I'm actually not sure how likely parallel backups are, if they are possible at all. Please let me know if you know more on this matter!

Either way, after having read the documentation you linked to, I think you are absolutely correct. It does not make sense to pick a base backup that has not yet finished by the "recovery target".
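The corrected selection logic discussed above could look something like the following sketch. The `finish_time` key and ISO-8601 timestamps mirror what wal-g's `backup-list --detail --json` reports, but treat the exact field names as an assumption; this `choose_backup` is a simplified stand-in for the script's real function, not the actual patch.

```python
from datetime import datetime

def choose_backup(backup_list, recovery_target_time):
    """Pick the newest backup that *finished* no later than the target,
    per the PostgreSQL requirement that the recovery target lie after
    the ending time of the base backup."""
    match = match_timestamp = None
    for backup in backup_list:
        finish_time = datetime.fromisoformat(backup['finish_time'])
        if finish_time <= recovery_target_time:
            # Keep the latest qualifying backup seen so far.
            if match_timestamp is None or finish_time > match_timestamp:
                match, match_timestamp = backup, finish_time
    return match
```

With this comparison, a backup that started early but finished after the recovery target can never be selected, which is the failure mode the comment above is guarding against.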
@@ -57,8 +60,7 @@ def fix_output(output):
     started = None
     for line in output.decode('utf-8').splitlines():
         if not started:
-            started = re.match(r'^name\s+last_modified\s+', line)
-            started = re.match(r'^name\s+last_modified\s+', line) or re.match(r'^name\s+modified\s+', line)
+            started = re.match(r'^(backup_)?name\s+(last_)?modified\s+', line)
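The combined pattern in the new line can be exercised in isolation. The sample header rows below are illustrative of wal-e-style ("name  last_modified  …") and wal-g-style ("backup_name  modified  …") `backup-list` output, not copied verbatim from either tool.

```python
import re

# One pattern covering both header variants:
#   wal-e:  name          last_modified   ...
#   wal-g:  backup_name   modified        ...
HEADER = re.compile(r'^(backup_)?name\s+(last_)?modified\s+')

samples = [
    'name          last_modified   expanded_size_bytes',
    'backup_name   modified        wal_segment_backup_start',
    'name          modified        x',
]
matched = [bool(HEADER.match(line)) for line in samples]
```

Because both prefixes are optional groups, all four combinations of `backup_`/`last_` match, while unrelated header lines do not.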
Make sure this matches up with the fix upstream so that we don't reintroduce the bug: zalando#1026
@@ -43,7 +43,7 @@ ATTEMPT=0
 server_version="-1"
 while true; do
 [[ -z $wal_segment_backup_start ]] && wal_segment_backup_start=$($WAL_E backup-list 2> /dev/null \
-    | sed '0,/^name\s*\(last_\)\?modified\s*/d' | sort -bk2 | tail -n1 | awk '{print $3;}' | sed 's/_.*$//')
+    | sed '0,/^(backup_\)?name\s*\(last_\)\?modified\s*/d' | sort -bk2 | tail -n1 | awk '{print $3;}' | sed 's/_.*$//')
Make sure this matches up with the fix upstream so that we don't reintroduce the bug: zalando#1026
Hm, missing `\`? In sed's basic regular expressions the group delimiters (and the optional quantifier) must be escaped, so `(backup_\)?` never matches. Suggested change:

-    | sed '0,/^(backup_\)?name\s*\(last_\)\?modified\s*/d' | sort -bk2 | tail -n1 | awk '{print $3;}' | sed 's/_.*$//')
+    | sed '0,/^\(backup_\)\?name\s*\(last_\)\?modified\s*/d' | sort -bk2 | tail -n1 | awk '{print $3;}' | sed 's/_.*$//')
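For reference, the intent of that shell pipeline can be sketched in Python, which makes it easier to test: drop everything up to and including the header row, pick the row with the newest modified column, and return its third field up to the first underscore. The three-column sample listing and its layout are an assumption for illustration, modeled on wal-e `backup-list` output.

```python
import re

def wal_segment_backup_start(listing):
    """Python analogue of: sed '0,/^\\(backup_\\)\\?name.../d' | sort -bk2
    | tail -n1 | awk '{print $3}' | sed 's/_.*$//'"""
    header = re.compile(r'^(backup_)?name\s+(last_)?modified\s+')
    rows, seen_header = [], False
    for line in listing.splitlines():
        if not seen_header:
            # Skip lines until the header row has been consumed.
            seen_header = bool(header.match(line))
            continue
        if line.strip():
            rows.append(line.split())
    if not rows:
        return None
    # Newest row by the second (modified) column, like `sort -bk2 | tail -n1`.
    newest = max(rows, key=lambda row: row[1])
    # Third field, truncated at the first underscore (drops the offset).
    return newest[2].split('_')[0]
```

Note that the Python version sidesteps the BRE-escaping pitfall entirely, which is one reason the regex bug above slipped through review.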
Hey @elastisys/goto-postgresql, is the target branch still correct? (as opposed to needing a rebase)
It should now be based on this branch.
New PR rebasing these changes (since it was opened from @simonklb 's private repo): |
Fixes https://github.com/elastisys/compliantkubernetes-postgresql/issues/241

Also fixes the following upstream issues AFAIK (more investigation needed to verify):

Does not fix when Spilo relies on `wal-g backup-list LATEST`:

Breaks cloning from a backup that does not include `metadata.json` (for example a backup created with `wal-e`):