Merge pull request #5023 from ministryofjustice/velero-backup-failure
docs: ✏️ velero alert
jaskaransarkaria authored Nov 20, 2023
2 parents eec6a76 + b782e92 commit 523f101
Showing 2 changed files with 26 additions and 4 deletions.
26 changes: 24 additions & 2 deletions runbooks/source/disaster-recovery-scenarios.html.md.erb
@@ -1,8 +1,8 @@
---
title: Cloud Platform Disaster Recovery Scenarios
weight: 91
last_reviewed_on: 2023-11-20
review_in: 3 months
last_reviewed_on: 2024-05-20
review_in: 6 months
---

# Cloud Platform Disaster Recovery Scenarios
@@ -266,3 +266,25 @@ terraform plan -target=module.starter_pack

No changes. Infrastructure is up-to-date.
```

### Resolving a PartiallyFailed backup alert

A backup may fail and trigger an alert in `lower-priority-alerts`. Inspect the backup job:

```
kubectl get backup -n velero | grep -C 30 YYYYMMDD
```

Identify the failed backup by its `phase: PartiallyFailed`; it will also have an `errors` field with a count.
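As a minimal sketch of narrowing the list down to failed backups, the helper below filters rows of the form `NAME PHASE ERRORS`. The function name `partially_failed` is hypothetical, and the `.status.phase` and `.status.errors` paths in the commented usage are assumptions about where the Backup resource exposes those fields.

```shell
# Hypothetical helper: keep only rows whose PHASE column is PartiallyFailed.
# Expects whitespace-separated "NAME PHASE ERRORS" rows on stdin.
partially_failed() {
  awk '$2 == "PartiallyFailed" { print $1, "errors=" $3 }'
}

# Assumed usage against a live cluster (field paths are assumptions):
# kubectl get backup -n velero --no-headers \
#   -o custom-columns=NAME:.metadata.name,PHASE:.status.phase,ERRORS:.status.errors \
#   | partially_failed
```

Keeping the filter separate from the `kubectl` call means it can be tried on saved output before being run against the cluster.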

To understand the cause of the alert, pull the error messages for the velero pod from Kibana:

```
kubernetes.pod_name: velero-<container-id> and log: "level=error"
```

Sometimes the cause of the alert is genuine. For instance, a volume may have been removed (pod restart during a backup):

```
level=error msg="Error backing up item" backup=velero/velero-allnamespacebackup-20231120090023 error="error getting volume info: rpc error: code = Unknown desc = InvalidVolume.NotFound: The volume 'vol-08d317558ab5bd46b' does not exist.\n\tstatus code: 400
```
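
When several errors are logged, it can help to list the distinct volume IDs they mention. The sketch below assumes the log lines have been exported from Kibana to stdin and that volume IDs follow the `vol-<hex>` shape seen in the example above. The function name `missing_volumes` is hypothetical.

```shell
# Hypothetical helper: print each distinct volume ID that appears in
# InvalidVolume.NotFound error lines read from stdin.
missing_volumes() {
  grep 'InvalidVolume.NotFound' | grep -oE 'vol-[0-9a-f]+' | sort -u
}
```

Each ID could then be checked against AWS, for example with `aws ec2 describe-volumes --volume-ids <volume-id>`, assuming the AWS CLI is configured for the right account.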
4 changes: 2 additions & 2 deletions runbooks/source/tips-and-tricks.html.md.erb
@@ -1,8 +1,8 @@
---
title: Tips and Tricks
weight: 9200
last_reviewed_on: 2023-10-03
review_in: 3 months
last_reviewed_on: 2023-05-06
review_in: 6 months
---

# Tips and Tricks
