From f4d3dcb875aa5087bc0f70801259d394a5fcc6c8 Mon Sep 17 00:00:00 2001
From: Erik Sundell
Date: Tue, 7 May 2024 13:11:24 +0200
Subject: [PATCH] docs: iteration on aws k8s upgrade docs

---
 docs/howto/upgrade-cluster/aws.md | 144 +++++++++---------------------
 1 file changed, 42 insertions(+), 102 deletions(-)

diff --git a/docs/howto/upgrade-cluster/aws.md b/docs/howto/upgrade-cluster/aws.md
index e707e1a13b..f8229546ee 100644
--- a/docs/howto/upgrade-cluster/aws.md
+++ b/docs/howto/upgrade-cluster/aws.md
@@ -3,73 +3,19 @@
 # Upgrade Kubernetes cluster on AWS
 
 ```{warning}
-This upgrade will cause disruptions for users and trigger alerts for
-[](uptime-checks). To help other engineers, communicate that your are starting a
-cluster upgrade in the `#maintenance-notices` Slack channel and setup a [snooze](uptime-checks:snoozes)
-```
-
-```{warning}
-We haven't yet established a policy for planning and communicating maintenance
-procedures to users. So preliminary, only make a k8s cluster upgrade while the
-cluster is unused or that the maintenance is communicated ahead of time.
+Before proceeding, communicate that you are starting a cluster upgrade in the
+`#maintenance-notices` Slack channel.
 ```
 
 ## Pre-requisites
 
 1. *Install or upgrade CLI tools*
 
-   Install required tools as documented in [](new-cluster:prerequisites),
-   and ensure you have a recent version of eksctl.
-
-   ```{warning}
-   Using a modern version of `eksctl` has been found important historically, make
-   sure to use the latest version to avoid debugging an already fixed bug!
-   ```
-
-2. *Consider changes to `template.jsonnet`*
-
-   The eksctl config jinja2 template `eksctl/template.jsonnet` was once used to
-   generate the jsonnet template `eksctl/$CLUSTER_NAME.jsonnet`, that has been
-   used to generate an actual eksctl config.
-
-   Before upgrading an EKS cluster, it could be a good time to consider changes
-   to `eksctl/template.jsonnet` since this cluster's jsonnet template was last
-   generated, which it was initially according to
-   [](new-cluster:generate-cluster-files).
-
-   To do this first ensure `git status` reports no changes, then generate new
-   cluster files using the deployer script, then restore changes to everything
-   but the `eksctl/$CLUSTER_NAME.jsonnet` file.
+   Install required tools as documented in [](new-cluster:aws-required-tools),
+   and ensure you have the latest version of `eksctl`. Without it you may be
+   unable to use modern versions of k8s.
 
-   ```bash
-   export CLUSTER_NAME=
-   export CLUSTER_REGION=
-   export HUB_TYPE=
-   ```
-
-   ```bash
-   # only continue below if git status reports a clean state
-   git status
-
-   # generates a few new files
-   deployer generate dedicated-cluster aws --cluster-name=$CLUSTER_NAME --cluster-region=$CLUSTER_REGION --hub-type=$HUB_TYPE
-
-   # overview changed files
-   git status
-
-   # restore changes to all files but the .jsonnet files
-   git add *.jsonnet
-   git checkout .. # .. should be the git repo's root
-   git reset
-
-   # inspect changes
-   git diff
-   ```
-
-   Finally if you identify changes you think should be retained, add and commit
-   them. Discard the remaining changes with a `git checkout .` command.
-
-3. *Learn how to generate an `eksctl` config file*
+2. *Learn/recall how to generate an `eksctl` config file*
 
    When upgrading an EKS cluster, we will use `eksctl` extensively and reference
    a generated config file, `$CLUSTER_NAME.eksctl.yaml`. It's generated from the
@@ -90,14 +36,26 @@ cluster is unused or that the maintenance is communicated ahead of time.
 
 ## Cluster upgrade
 
-### 1. Ensure in-cluster permissions
+### 1. Acquire and configure AWS credentials
+
+Refer to [](cloud-access:aws) on how to do this.
 
-The k8s api-server won't accept commands from you unless you have configured
-a mapping between the AWS user to a k8s user, and `eksctl` needs to make some
-commands behind the scenes.
+```{warning}
+If you use `deployer use-cluster-credentials` in a terminal where you configured
+these credentials, you will no longer act as your own AWS user but the
+`hub-deployer-user` that doesn't have the relevant permissions to work with
+`eksctl`.
+```
 
-This mapping is done from a ConfigMap in kube-system called `aws-auth`, and
-we can use an `eksctl` command to influence it.
+### 2. Ensure in-cluster permissions
+
+`eksctl` may need to use your AWS credentials to act within the k8s cluster
+(`kubectl drain` during node pool operations for example), but the k8s
+api-server won't accept commands from your AWS user unless you have configured a
+mapping between it and a k8s user.
+
+This mapping is done from a ConfigMap in kube-system called `aws-auth`, and we
+should use an `eksctl` command to influence it.
 
 ```bash
 eksctl create iamidentitymapping \
@@ -108,19 +66,13 @@ eksctl create iamidentitymapping \
    --group=system:masters
 ```
 
-### 2. Acquire and configure AWS credentials
-
-Visit https://2i2c.awsapps.com/start#/ and acquire CLI credentials.
-
-In case the AWS account isn't managed there, inspect
-`config/$CLUSTER_NAME/cluster.yaml` to understand what AWS account number to
-login to at https://console.aws.amazon.com/.
-
-Configure credentials like:
-
-```bash
-export AWS_ACCESS_KEY_ID="..."
-export AWS_SECRET_ACCESS_KEY="..."
+```{note}
+Configuring this mapping only makes `eksctl` ready to work against the k8s
+cluster - not `kubectl`. While you could use the `eksctl utils write-kubeconfig`
+command for this, it's often more convenient to open a _new terminal tab_ and use
+`deployer use-cluster-credentials $CLUSTER_NAME` as that won't clutter your
+kubeconfig and will also allow you to inspect what goes on while `eksctl` is
+working.
 ```
 
 ### 3. Upgrade the k8s control plane one minor version
@@ -129,7 +81,7 @@ export AWS_SECRET_ACCESS_KEY="..."
 The k8s control plane can only be upgraded one minor version at the time.[^1]
 ```
 
-#### 3.1. Update the cluster's version field one minor version.
+#### 3.1. Update the cluster's version field one minor version
 
 In the cluster's config file there should be an entry like the one below,
 where the version must be updated.
@@ -138,7 +90,7 @@ where the version must be updated.
 {
    name: "openscapeshub",
    region: clusterRegion,
-   version: '1.27'
+   version: "1.29",
 }
 ```
 
@@ -199,32 +151,20 @@ doesn't break the three minor versions rule.
 Then, you can upgrade the node groups directly from 1.25 to 1.28 making only one
 upgrade on the node groups instead of three.
 
-### 5. Upgrade node groups version until it matches the k8s control plane
+### 5. Upgrade node groups version
 
-```{important}
-Per step 4 above, you can upgrade the version of the node groups maximum
-three versions at once, for example from 1.25 to 1.28 directly if the
-control plane's version allows it.
-
-If after one such upgrade, the node groups version is still behind
-the k8s control plane, you will need to repeat the node upgrade process
-until it does.
-```
+Upgrading user node groups and core node groups is disruptive as described in
+FIXME.
 
-To upgrade (unmanaged) node groups, you delete them and then add them back in. When
-adding them back, make sure your cluster config's k8s version is what you
-want the node groups to be added back as.
+If there are no users on the node groups you are to update, you can delete them
+and then add them back, but otherwise you should add a new node pool, taint the
+old, and later delete the old when it's empty.
 
 #### 5.1. Double-check current k8s version in the config
 
-Up until this step, you should have updated the control plane's
-version at least once but for maximum of three times. So you shouldn't
-need to update it.
-
-However, it is worth double checking that the k8s version that is in
-the config file is:
-- not ahead of the current k8s control plane version, as this will
-  influence the version of the node groups.
+Double check that the k8s version in the config file is:
+- not ahead of the current k8s control plane version, as this will influence the
+  version of the node groups.
 - not **more than three minor versions** than what the version of node groups
   was initially