Merge pull request #6128 from ministryofjustice/runbooks-update
update review dates, punctuation and tense modifications
FolarinOyenuga authored Sep 3, 2024
2 parents 1bed990 + e66fc02 commit 51a9ed0
Showing 4 changed files with 22 additions and 22 deletions.
16 changes: 8 additions & 8 deletions runbooks/source/cloud-platform-to-tgw.html.md.erb
@@ -1,25 +1,25 @@
---
title: Adding a route to connect to a TGW
weight: 9000
last_reviewed_on: 2024-02-27
last_reviewed_on: 2024-09-03
review_in: 6 months
---

# Adding a route to connect to the MOJ Transit Gateway

This document is a description of the current Cloud Platform attachment to the MoJ Transit Gateway.
It also explain how to modify the relevant route table to route traffic from Cloud Platform to the MoJ Transit Gateway.
It also explains how to modify the relevant route table to route traffic from Cloud Platform to the MoJ Transit Gateway.

*The scope of this guide is deliberately limited: it covers only the Cloud Platform responsibilities.
The NVVS DevOps team is able to share the Transit Gateways with other AWS accounts.*

## Quick introduction

AWS Transit Gateways allows VPCs, from different accounts or regions, to be connected securely.
AWS Transit Gateways allow VPCs, from different accounts or regions, to be connected securely.
Transit Gateways have their own route-table.
Transit Gateways (TGW) also support connecting to VPNs, AWS Direct Connect and other Transit Gateway
Transit Gateways (TGW) also support connecting to VPNs, AWS Direct Connect and other Transit Gateways.

An important limitation: a TGW can only work with VPCs in the same region it is in. However, TGW from different regions can be peered.
An important limitation: a TGW can only work with VPCs in the same region it is in. However, TGWs from different regions can be peered.

The MoJ Transit Gateway infrastructure is managed here: [github repository]

@@ -34,11 +34,11 @@ The VPC attachment is done by creating the resource `aws_ec2_transit_gateway_vpc_attachment` with the
vpc_id and the private subnet_ids of the `live-1` VPC. The NVVS DevOps team then approves the VPC attachment and adds the attachment from their side.
To allow the traffic to flow, a new route needs to be added to the VPC's route-table for each target VPC.

Example: The Analytical Platform(AP) wants to access the Cloud Platform (CP) VPC.
Example: The Analytical Platform(AP) wants to access the Cloud Platform's (CP) VPC.

- Both are attached to the TGW
- The CP route-table should contain a route with the AP VPC's CIDR block as the destination and the TGW ID as the target.
- The same needs to be done on the AP VPC, to route back to CP.
- The same needs to be done on the AP's VPC, to route back to CP.
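The routing described in the bullets above can be sketched with the AWS CLI — purely illustrative, with hypothetical IDs and CIDR block (in practice the Cloud Platform change is made through Terraform, as described in the next section):

```shell
# Hypothetical IDs and CIDR for illustration only; substitute real values.
# Destination: the AP VPC's CIDR block; target: the shared Transit Gateway.
aws ec2 create-route \
  --route-table-id rtb-0123456789abcdef0 \
  --destination-cidr-block 10.200.0.0/16 \
  --transit-gateway-id tgw-0123456789abcdef0
```

A mirror-image route (the CP CIDR block via the same TGW) is needed in the AP VPC's route-table.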

## Making the change

@@ -55,7 +55,7 @@
```
pttp_tgw_destination_cidr_blocks = [
...
]
```
Note: Something similar need to be done on the 'other side', terraform or not.
Note: Something similar needs to be done on the 'other side', terraform or not.

There is a task defined in the `infrastructure-vpc-live-1` [Concourse pipeline] that will apply the new route(s) when merged with the main branch.
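Once the pipeline has applied the change, you can sanity-check that the route landed in the route table — a sketch with a hypothetical route-table ID:

```shell
# Hypothetical route-table ID; look up the real one in the AWS console
# or with `aws ec2 describe-route-tables` and an appropriate filter.
aws ec2 describe-route-tables \
  --route-table-ids rtb-0123456789abcdef0 \
  --query "RouteTables[0].Routes"
```

Routes whose target is the TGW show a `TransitGatewayId` field instead of a gateway or instance ID.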

2 changes: 1 addition & 1 deletion runbooks/source/manually-apply-namespace.html.md.erb
@@ -1,7 +1,7 @@
---
title: Manually Plan/Apply Namespace Resources
weight: 180
last_reviewed_on: 2024-02-23
last_reviewed_on: 2024-09-03
review_in: 6 months
---

8 changes: 4 additions & 4 deletions runbooks/source/on-call.html.md.erb
@@ -1,7 +1,7 @@
---
title: Going on call
weight: 9150
last_reviewed_on: 2024-02-23
last_reviewed_on: 2024-09-03
review_in: 6 months
---

@@ -71,11 +71,11 @@ The line manager should:

## Contractors

The exact process might depend upon which supplier you are through. Some (or maybe all, we haven’t worked this out yet) supplier systems don’t cope with trying to bill more than 5 days worth in a week (Specifically, "Hays" fall in to this category).
The exact process might depend upon which supplier you are through. Some (or maybe all, we haven’t worked this out yet) supplier systems don’t cope with trying to bill more than 5 days’ worth in a week (specifically, "Hays" falls into this category).

If you are not working on a normal work day (Monday–Friday) then you can use this to bill:
If you are not working on a normal work day (Monday–Friday), then you can use this to bill:

1. Ask your contact at the supplier to set you up on Fieldglass for overtime.
2. Log your time
* ??? Log the overtime (More detail once someone has actually done this successfully); or
* ??? Log the overtime (More details once someone has actually done this successfully); or
* If you are billing a ‘work day’ in lieu, note this as a comment in the timesheet
18 changes: 9 additions & 9 deletions runbooks/source/velero.html.md.erb
@@ -1,7 +1,7 @@
---
title: Velero - Cluster backups and disaster recovery
weight: 601
last_reviewed_on: 2024-02-23
last_reviewed_on: 2024-09-03
review_in: 6 months
---

@@ -29,7 +29,7 @@ You can verify the Velero server pod is installed and running using the kubectl command:
```
kubectl get pods -n velero
```

All backups are are stored in a S3 bucket for 30 days. To view the backup location run the following command:
All backups are stored in an S3 bucket for 30 days. To view the backup location, run the following command:

```
velero get backup-locations
```

@@ -42,21 +42,21 @@ The output will show the bucket name and corresponding prefix.

There are many reasons why you might need to back up part or all of your cluster; a few high-level reasons are:

- The cluster can not be recovered back to to a fully working state after a failed upgrade
- The cluster can not be recovered back to a fully working state after a failed upgrade
- A user accidentally deletes a resource or namespace
- Lost or corrupted persistent storage (EBS Volume)

### What?

Velero can be used to back up anything from a single resource type to the whole cluster (all namespaces and resources).

StatefulSets including corresponding persistent volumes are backed up together, as unlike stateless applications, it is not possible to easily get an application restored when it has persistent data.
StatefulSets, including their corresponding persistent volumes, are backed up together: unlike stateless applications, an application with persistent data cannot easily be restored without it.

Velero **does not** backup the Kubernetes state stored in etcd. It is highly recommenced that a specific backup solution is used for etcd and its relevant certificates in addition to Velero.
Velero **does not** back up the Kubernetes state stored in etcd. It is highly recommended that a specific backup solution is used for etcd and its relevant certificates in addition to Velero.
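For example, an etcd snapshot can be taken with `etcdctl` — a sketch only, since the endpoint and certificate paths below are assumptions and are cluster-specific:

```shell
# Endpoint, output path and certificate paths are illustrative assumptions;
# use your cluster's real etcd endpoint and certificates.
ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/server.crt \
  --key=/etc/etcd/server.key
```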

### How?

It is possible to create manual backups as well as schedules ones.
It is possible to create manual backups as well as scheduled ones.
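For the scheduled case, a backup schedule can be created with `velero schedule create` — an illustrative sketch with an assumed schedule name, cron expression and retention period:

```shell
# Assumed values: a daily backup of all namespaces at 02:00 UTC,
# retained for 30 days (720h).
velero schedule create daily-all-namespaces \
  --schedule="0 2 * * *" \
  --ttl 720h
```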

#### Manual backups

@@ -82,14 +82,14 @@ To view all stored backups within the cluster, run the following command:
```
velero get backups
```

It is also possible to restore (and exclude) specific namespaces and resources from a velero-allnamespace backup. Below are examples of restoring a specific namespace and resource.
It is also possible to restore (and exclude) specific namespaces and resources from a velero-allnamespace backup. Below are examples of restoring a specific namespace and resource:

```
velero restore create --from-backup velero-allnamespacebackup --include-namespaces monitoring
velero restore create --from-backup velero-allnamespacebackup --include-resources prometheusrules

```
In a disaster recovery scenario, scheduled backups continue to backup, this may create backups with incorrect config. Instead of stopping individual scheduled jobs, it is recommended to change the storage location access mode to `ReadOnly`. To do this you require the storage location name. the following command will give you the storage location name and current access mode:
In a disaster recovery scenario, scheduled backups continue to run, which may create backups with incorrect config. Instead of stopping individual scheduled jobs, it is recommended to change the storage location access mode to `ReadOnly`. To do this, you require the storage location name. The following command will give you the storage location name and current access mode:

```
velero get backup-locations
```

@@ -127,7 +127,7 @@ velero backup-location create <location-name> --provider aws --bucket <bucket-name>

At this point, when you execute `velero get backups` you will see 2 different locations under the storage location column.

If you want to create normal backups with 2 different storage locations configured, you either have to set one of the locatons as default. or add `--volume-snapshot-locations` to your backup command.
If you want to create normal backups with 2 different storage locations configured, you either have to set one of the locations as default or add `--volume-snapshot-locations` to your backup command.
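As noted earlier, in a disaster recovery scenario the recommended approach is to flip the storage location to `ReadOnly`. One way to do this — a sketch, assuming Velero runs in the `velero` namespace and the location is named `default` — is to patch the BackupStorageLocation object directly:

```shell
# Assumes namespace `velero` and location name `default`; substitute the
# name reported by `velero get backup-locations`.
kubectl patch backupstoragelocation default \
  --namespace velero \
  --type merge \
  --patch '{"spec":{"accessMode":"ReadOnly"}}'
```

Setting the mode back to `ReadWrite` with the same patch re-enables normal backups afterwards.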

```
velero backup create <backup-name> --volume-snapshot-locations <location-name>
```
