Backup implementation: controller, plugins, collection, webhooks #841

Merged on Jan 20, 2025 (2 commits)


zerospiel
Contributor

@zerospiel zerospiel commented Dec 27, 2024

The main part of the backup implementation contains the following changes:

  • install velero via flux rather than code
  • adjust roles for the velero chart
  • remove unnecessary controller values
  • rename Backup to ManagementBackup
  • remove Oneshot parameter from the Spec
  • add StorageLocation to both Management and ManagementBackup
  • remove unused types
  • manage schedules in the ctrl instead of velero
  • new source runner for schedules
  • collect the required velero backup spec for the whole backup
  • label Credential references (clusterIdentities) in order to include them in backup
  • backup validation webhook
  • amend backup controller logic with better objects handling
  • fix bug in providertemplates ctrl when ownerreferences are being updated but requeue is not set
  • add custom plugins set via mgmt spec
  • rename k0smotron related provider labels to the correct ones from the k0sproject

To properly test the feature:

  • have a k0rdent instance built from this PR (preferably with a ready clusterdeployment)
  • install a BackupStorageLocation for velero, e.g.:
---
apiVersion: v1
data:
  # base64-encoded AK/SK; decoded, the value is 3 lines, e.g.:
  # [default]
  # aws_access_key_id = <key>
  # aws_secret_access_key = <secret_key>
  cloud: <base64>
kind: Secret
metadata:
  name: cloud-credentials
  namespace: kcm-system
type: Opaque
---
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: kcm-system
spec:
  config:
    region: <eu-central-1> # region with an s3 bucket
  default: true
  objectStorage:
    bucket: <bucket-name> # bucket name
  provider: aws
  credential:
    name: cloud-credentials
    key: cloud
  • enable backup in the Management object: set .spec.backup to something like {enabled: true, schedule: "@every 5m"}
  • wait for the scheduled backup to be created
  • fully drop the k0rdent instance (imitate a disaster)
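
The `cloud` value in the Secret above has to be the base64 encoding of the three-line credentials file. A minimal Go sketch of producing that value follows; the helper names `credentialsFile` and `encodeForSecret` are hypothetical and not part of this PR, and the key values are placeholders.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// credentialsFile renders the three-line credentials profile that the
// Secret's `cloud` key is expected to contain once decoded.
// Hypothetical helper for illustration only.
func credentialsFile(accessKeyID, secretAccessKey string) string {
	return fmt.Sprintf(
		"[default]\naws_access_key_id = %s\naws_secret_access_key = %s\n",
		accessKeyID, secretAccessKey,
	)
}

// encodeForSecret base64-encodes the content so it can be placed under
// `data:` in a Kubernetes Secret manifest.
func encodeForSecret(content string) string {
	return base64.StdEncoding.EncodeToString([]byte(content))
}

func main() {
	// Placeholder values, not real credentials.
	fmt.Println(encodeForSecret(credentialsFile("<key>", "<secret_key>")))
}
```

The printed string is what would replace `<base64>` in the manifest. Using `stringData` with the plain text instead of `data` would avoid the manual encoding step.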

The restoration process:

  • set up a clean k0rdent installation (without mgmt/releases/etc.)
  • install the backup storage location (see above) so that backups can be fetched from cloud storage
  • perform velero restore (e.g. velero restore create <name> --existing-resource-policy update --from-backup <backup-name>)
  • wait for the restore to reach the Completed state (e.g. check with velero restore get)
  • done

#814

@zerospiel zerospiel linked an issue Dec 27, 2024 that may be closed by this pull request
@zerospiel zerospiel force-pushed the backup_impl_2 branch 2 times, most recently from 80beb93 to 9694287 Compare December 30, 2024 12:16
@zerospiel zerospiel changed the title Backup impl 2 Backup implementation: controller, plugins, collection, webhooks Dec 30, 2024
@zerospiel zerospiel force-pushed the backup_impl_2 branch 9 times, most recently from 56f0846 to dd516c2 Compare January 6, 2025 16:01
@zerospiel zerospiel force-pushed the backup_impl_2 branch 5 times, most recently from d8b1204 to e581f96 Compare January 8, 2025 17:52
@zerospiel zerospiel marked this pull request as ready for review January 8, 2025 18:07
@zerospiel zerospiel force-pushed the backup_impl_2 branch 4 times, most recently from d95cd1c to fb31e55 Compare January 9, 2025 18:21
internal/controller/management_backup_controller.go (outdated)
internal/controller/backup/schedule.go (outdated)
internal/controller/management_backup_controller.go (outdated)
@zerospiel zerospiel force-pushed the backup_impl_2 branch 4 times, most recently from 86eb134 to 00f57d2 Compare January 13, 2025 11:26
@zerospiel zerospiel force-pushed the backup_impl_2 branch 3 times, most recently from e407f4e to 050229f Compare January 16, 2025 12:30
Contributor

@eromanova eromanova left a comment


LGTM in general but the backup logic requires final approval from Andrey.

templates/provider/kcm/values.yaml (outdated)
@zerospiel
Contributor Author

Should chart versions be updated because of the change to the provider label values?

@zerospiel zerospiel force-pushed the backup_impl_2 branch 3 times, most recently from ec72808 to 320b164 Compare January 17, 2025 13:18
api/v1alpha1/management_types.go (outdated)
cmd/main.go (outdated)
* install velero via flux rather than code
* adjust roles for the velero chart
* remove unnecessary controller values
* rename Backup to ManagementBackup
* remove Oneshot parameter from the Spec
* add StorageLocation to both Management and ManagementBackup
* remove unused types
* manage schedules in the ctrl instead of velero
* new source runner for schedules
* collect the required velero backup spec for the whole backup
* label Credential references (clusterIdentities) in order to include them in backup
* backup validation webhook
* amend backup controller logic with better objects handling
* fix bug in providertemplates ctrl when ownerreferences are being updated but requeue is not set
* add custom plugins set via mgmt spec
* rename k0smotron related provider labels to the correct ones from the k0sproject
* remove velero watcher from mgmtbackup ctrl
* remove velero enabled flag
* remove mgmt.backup spec
* remove webhooks for mgmt/mgmtbackups
* indexer now caches schedules and incomplete backups
* mgmtbackup spec extended with schedule
* mgmtbackup ctrl now does not rely on mgmt object
* simplify mgmtbackup ctrl logic
* set error to mgmtbackup on meta API errors
* fix error with getting next attempt time
@zerospiel zerospiel merged commit e16e8a0 into k0rdent:main Jan 20, 2025
6 checks passed
@zerospiel zerospiel deleted the backup_impl_2 branch January 20, 2025 15:29
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

implement backups reconciliation
3 participants