-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete Velero Backup when user sets the deleteVeleroBackup #67
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -27,6 +27,7 @@ import ( | |||
|
||||
"github.com/go-logr/logr" | ||||
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1" | ||||
apierrors "k8s.io/apimachinery/pkg/api/errors" | ||||
apimeta "k8s.io/apimachinery/pkg/api/meta" | ||||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||||
"k8s.io/apimachinery/pkg/types" | ||||
|
@@ -280,6 +281,44 @@ func CheckVeleroBackupLabels(backup *velerov1api.Backup) bool { | |||
return exists && value == constant.ManagedByLabelValue | ||||
} | ||||
|
||||
// DeleteVeleroBackup deletes Velero Backup object based on the NonAdminBackup object name and namespace | ||||
func DeleteVeleroBackup(ctx context.Context, r client.Client, logger logr.Logger, nab *nacv1alpha1.NonAdminBackup) (bool, error) { | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't the Deletion logic flawed here ? Why are we using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Our velero backup name is always generated based on https://github.com/migtools/oadp-non-admin/blob/master/internal/common/function/function.go#L115-L145
So we have always VeleroBackup could get deleted only if one matches that entire 3 parts. If we want to add extra protection we could, but then I would argue the UUID of the NonAdminBackup needs to be used from the VeleroBackup to match the UUID of the requestor. It is currently possible, because we store this annotation within VeleroBackup:
Let me know if I should add that protection on top of current name match. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why not to just use
? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I thought about that, but with the current implementation I can:
At this stage I have still can remove VeleroBackup from 2 if I create new NAB with the same name as the previous one with flag Using status will disallow me to do so, because 2 will never get created as this object will be there forever until sysadmin removes it. It's because Status can only be updated by controller when 2 is created. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't want to have corner case where for some reason NAB did not got proper update of the Status. To be honest we may want also add some safety feature to only delete VeleroBackup if it has the UID or other linkage to VeleroBackup from NAB object, this imo is separate design ? There needs to be one source of truth here and I've chosen our function, where it could also be Status, so either way. Function is less check if status exists as we have name and only check if VeleroBackup exists. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The NAB will not have a status only when the NAB controller is not running right ? or if the NAB controller is running and the status update call fails ? I think this makes it even more prudent that we check for the status of NAB first:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is a time period when the NAB was created, but Backup was not yet created. If we are in that time window there is a problem with getting proper velero backup name from status. If that happens we won't reconcile again and that object will be out of sync. That is the only additional use case which I can think of when it comes to inconsistency. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See thats the thing, we are not allowing Delete for such cases, grabbing the name from status would not allow a delete for such cases(because there is no VeleroBackup object in terminal phase) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, more then half of myself tend to agree with you now :-) Thanks. |
||||
veleroBackupName := GenerateVeleroBackupName(nab.Namespace, nab.Name) | ||||
|
||||
logger.V(1).Info(fmt.Sprintf("Attempting to delete VeleroBackup: %s", veleroBackupName)) | ||||
veleroBackup := velerov1api.Backup{} | ||||
|
||||
err := r.Get(ctx, types.NamespacedName{Name: veleroBackupName, Namespace: constant.OadpNamespace}, &veleroBackup) | ||||
|
||||
if err == nil { | ||||
err = r.Delete(ctx, &veleroBackup) | ||||
if err != nil { | ||||
logger.V(1).Info(fmt.Sprintf("Error deleting VeleroBackup: %s", veleroBackupName)) | ||||
return false, err | ||||
} | ||||
} else if apierrors.IsNotFound(err) { | ||||
return false, nil | ||||
} else { | ||||
logger.V(1).Info(fmt.Sprintf("Error while requeting VeleroBackup: %s from Namespace: %s", veleroBackupName, constant.OadpNamespace)) | ||||
return false, err | ||||
} | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think implementation is flawed here we ignore case that that means that Velero Backup was not even retrieved, but we say it got deleted There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. a suggestion is to always check if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 100% agree on that, will add that case. |
||||
|
||||
logger.V(1).Info(fmt.Sprintf("Deleted VeleroBackup: %s", veleroBackupName)) | ||||
return true, nil | ||||
} | ||||
|
||||
// DeleteNonAdminBackup deletes Non Admin Backup object | ||||
func DeleteNonAdminBackup(ctx context.Context, r client.Client, logger logr.Logger, nab *nacv1alpha1.NonAdminBackup) error { | ||||
err := r.Delete(ctx, nab) | ||||
if err != nil { | ||||
logger.V(1).Info(fmt.Sprintf("Error deleting NonAdminBackup: %s", nab.Name)) | ||||
return err | ||||
} | ||||
|
||||
logger.V(1).Info(fmt.Sprintf("Deleted NonAdminBackup: %s", nab.Name)) | ||||
return nil | ||||
} | ||||
|
||||
// TODO not used | ||||
|
||||
// GetNonAdminBackupFromVeleroBackup return referenced NonAdminBackup object from Velero Backup object, if no error occurs | ||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,6 +20,7 @@ package controller | |
import ( | ||
"context" | ||
"errors" | ||
"fmt" | ||
"time" | ||
|
||
"github.com/go-logr/logr" | ||
|
@@ -87,6 +88,13 @@ func (r *NonAdminBackupReconciler) Reconcile(ctx context.Context, req ctrl.Reque | |
return ctrl.Result{}, err | ||
} | ||
|
||
reconcileExit, reconcileErr := r.DeleteVeleroBackup(ctx, rLog, &nab) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Won't we have a case to requeue here ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Lint job complained that the requeue was not used in the r.DeleteVeleroBackup function, so I removed it from the logic. |
||
if reconcileExit && reconcileErr != nil { | ||
return ctrl.Result{}, reconcile.TerminalError(reconcileErr) | ||
} else if reconcileExit { | ||
return ctrl.Result{}, nil | ||
} | ||
|
||
reconcileExit, reconcileRequeue, reconcileErr := r.InitNonAdminBackup(ctx, rLog, &nab) | ||
if reconcileRequeue { | ||
return ctrl.Result{Requeue: true, RequeueAfter: requeueTimeSeconds * time.Second}, reconcileErr | ||
|
@@ -118,6 +126,52 @@ func (r *NonAdminBackupReconciler) Reconcile(ctx context.Context, req ctrl.Reque | |
return ctrl.Result{}, nil | ||
} | ||
|
||
// DeleteVeleroBackup deletes original Velero Backup object when the deleteVeleroBackup Spec is set to true | ||
// | ||
// Parameters: | ||
// | ||
// ctx: Context for the request. | ||
// logrLogger: Logger instance for logging messages. | ||
// nab: Pointer to the NonAdminBackup object. | ||
func (r *NonAdminBackupReconciler) DeleteVeleroBackup(ctx context.Context, logrLogger logr.Logger, nab *nacv1alpha1.NonAdminBackup) (exitReconcile bool, errorReconcile error) { | ||
rLog := log.FromContext(ctx) | ||
logger := logrLogger.WithValues("DeleteVeleroBackup", nab.Namespace) | ||
|
||
// Delete Velero Backup if the deleteVeleroBackup was set to true | ||
if nab.Spec.DeleteVeleroBackup != nil && *nab.Spec.DeleteVeleroBackup { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should we check here NAB status? for example, I create a NAB with delete: true, it will be immediately deleted, right? should we only delete finished NABs? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think so, if someone want to abuse spec then it's free to remove object that has been just created. Similarly to: mkdir /tmp/test && rmdir /tmp/test Nobody stops anyone from doing above There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with @mateusoliveira43 we should only delete VeleroBackup that are in terminal phase. Currently, I don't think we can delete an in-progress VeleroBackup |
||
deleted, deleteErr := function.DeleteVeleroBackup(ctx, r.Client, rLog, nab) | ||
if !deleted { | ||
// The Velero Object was not found | ||
// Remove NonAdminBackup | ||
// There is potential that Velero Sync controller will recreate Velero Backup | ||
// if one exists in the s3 storage, and this will as well cause NonAdminBackup | ||
// to be recreated by the NAC sync controller, but at this stage we have current | ||
// proper state reflected. | ||
logger.V(1).Info(fmt.Sprintf("Velero Backup not deleted for NAB: %s", nab.Name)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. would be interesting to add reason here? something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. so this will cause NAB to be in phase Backing Off, and all the info about the referenced velero backup is in the status of that NAB, do we want to duplicate this info within debug logs? |
||
} | ||
|
||
if deleteErr == nil { | ||
// Delete NonAdminBackup object | ||
deleteErrNab := function.DeleteNonAdminBackup(ctx, r.Client, rLog, nab) | ||
return true, deleteErrNab | ||
} | ||
// There was an error while trying to delete Velero Backup | ||
// Set the Phase to BackingOff and Deletion condition to False | ||
_, errUpdateCondition := function.UpdateNonAdminBackupCondition(ctx, r.Client, logger, nab, nacv1alpha1.NonAdminConditionDeletion, metav1.ConditionFalse, "BackupDeleted", "Unable to delete origin Velero Backup") | ||
if errUpdateCondition != nil { | ||
logger.Error(errUpdateCondition, "Unable to set Deletion Condition: True", nameField, nab.Name, constant.NameSpaceString, nab.Namespace) | ||
} | ||
_, errUpdatePhase := function.UpdateNonAdminPhase(ctx, r.Client, logger, nab, nacv1alpha1.NonAdminBackupPhaseBackingOff) | ||
|
||
if errUpdatePhase != nil { | ||
logger.Error(errUpdatePhase, "Unable to set NonAdminBackup Phase: BackingOff", nameField, nab.Name, constant.NameSpaceString, nab.Namespace) | ||
} | ||
return true, deleteErr | ||
} | ||
|
||
return false, nil | ||
} | ||
|
||
// InitNonAdminBackup sets the New Phase on a NonAdminBackup object if it is not already set. | ||
// | ||
// Parameters: | ||
|
@@ -201,7 +255,7 @@ func (r *NonAdminBackupReconciler) ValidateVeleroBackupSpec(ctx context.Context, | |
return true, false, err | ||
} | ||
|
||
updatedStatus, errUpdateStatus := function.UpdateNonAdminBackupCondition(ctx, r.Client, logger, nab, nacv1alpha1.NonAdminConditionAccepted, metav1.ConditionTrue, "BackupAccepted", "backup accepted") | ||
updatedStatus, errUpdateStatus := function.UpdateNonAdminBackupCondition(ctx, r.Client, logger, nab, nacv1alpha1.NonAdminConditionAccepted, metav1.ConditionTrue, "BackupAccepted", "Backup accepted") | ||
if errUpdateStatus != nil { | ||
logger.Error(errUpdateStatus, "Unable to set BackupAccepted Condition: True", nameField, nab.Name, constant.NameSpaceString, nab.Namespace) | ||
return true, false, errUpdateStatus | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this also deletes NAB itself
Would be better to inform this in doc comment and change field name to simply
delete
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete isn't enough imo, we need to be more precise here, however I agree that current name isn't ideal as we also remove NAB so we need to find out better one.
How about
DeleteBackup
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah
DeleteBackup
sounds good !