Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Garbage collector #152

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

mateusoliveira43
Copy link
Contributor

@mateusoliveira43 mateusoliveira43 commented Jan 27, 2025

Why the changes were made

Blocked by #139

Fix #145

If for some reason a non admin object is deleted without waiting its finalizer to delete the related Velero object, that object will become a leftover in OADP namespace, which non admin user can not clean up. To avoid these scenarios, garbage collector will run periodically to clean these possible leftover objects.

How to test the changes made

Follow install-from-source testing documentation, pointing this branch and my OADP fork branch feat/nac-garbage-collector

  • Create a DPA with non admin enabled
  • Create some NABs and NARs
  • Delete DPA
  • Remove finalizers from NABs and NARs and delete them
  • Leftover related Backups and Restores will remain in OADP namespace
  • Recreate DPA
  • Leftover related Backups and Restores in OADP namespace should be deleted

Copy link

openshift-ci bot commented Jan 27, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link

openshift-ci bot commented Jan 27, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mateusoliveira43

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

cmd/main.go Outdated Show resolved Hide resolved
// TODO check UUID as well?
if !function.CheckVeleroBackupAnnotations(&backup) {
logger.V(1).Info("Backup does not have required annotations", constant.NameString, backup.Name)
// TODO delete?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

object has NAC labels, but no NAC annotations. What should we do?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not happen

Annotations: function.GetNonAdminBackupAnnotations(nab.ObjectMeta),
but if for some reason it does I think it would be logged as warning or INFO not a DEBUG. Alternative if you want this to be visible in OpenShift’s Events tab, you can emit an Event from your controller. This will help cluster admins and users see what’s happening without digging into logs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is no warning level in this lib we use

what is the decision here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotations are only to ensure ownership reference exists between NA* and Velero objects.
If labels are there it means that they are managed and created by the controller.
If annotations are missing it means that for some different update/patch to the Velero object the ownership reference was removed.

On a second thought I believe this is something that is going to be subject for GC removal.

Even if the Velero Syncs back objects from the S3 storage to the openshift-adp namespace, those annotations are preserved.

Signed-off-by: Mateus Oliveira <[email protected]>
Signed-off-by: Mateus Oliveira <[email protected]>
// TODO duplication in delete logic
// TODO do deletion in parallel?

// TODO delete secret as well?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should also check for possible leftover secrets?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, those should be cleaned up by the NABSL controller. On top of that now when I think about that we may want to add OwnerReference between Secret <-> Velero BSL object. This will cause Secret to be removed if the Velero BSL object is gone, so we don't have to implement this part in GC coontroller ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would be good

but secret is created prior to bsl, can be added independent of the creation order?

Copy link
Collaborator

@mpryc mpryc Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. We can not add or patch the ownerReference on the Secret which already exists. What we could do is to add it vice-versa to the Velero BSL and make Velero BSL having ownerReference to the secret and then add blockOwnerDeletion on the Secret object. This should prevent secret to be removed independently from the BSL, but as always it's not 100% guarantee and there are edge cases where this still may happen.

@mateusoliveira43
Copy link
Contributor Author

/hold

@mpryc
Copy link
Collaborator

mpryc commented Jan 31, 2025

@mateusoliveira43 I am not sure if you will need this part, but implemented in case you will: #162

@mpryc
Copy link
Collaborator

mpryc commented Jan 31, 2025

@mateusoliveira43 @shubham-pampattiwar After going again through our drawings I found that #162 isn't really what we discussed and implemented the #163 which should now satisfy the deletion workflow and GC.

Signed-off-by: Mateus Oliveira <[email protected]>
@@ -82,3 +82,5 @@ require (
)

replace github.com/vmware-tanzu/velero => github.com/openshift/velero v0.10.2-0.20241211163542-fa8f2486175b

replace github.com/openshift/oadp-operator => github.com/mateusoliveira43/oadp-operator v0.0.0-20250131150824-b010f27c07bb
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO remove prior to merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a Garbage Collector (GC) controller
2 participants