This repo provides a way to deploy the infrastructure necessary for the manual deployment of a multi-node K8S cluster. Manual deployment is chosen as an exercise.
The goal is to build a highly available cluster with 3 master nodes with stacked etcd, 3 worker nodes, and a two-node load balancer to distribute traffic to the kube-apiservers on the master nodes.
IP addresses and hostnames are configured statically; local DNS zone records are updated using TSIG.
Load balancers are configured using Alpine Linux, HAproxy, and Keepalived.
Master and worker nodes are configured using Rocky Linux and Kubeadm.
Hostname | IP address | Type | Cores | Memory |
---|---|---|---|---|
kube-apiserver | 10.1.2.10 | virtual IP | - | - |
nlb-kube-apiserver-a | 10.1.2.11 | load balancer | 1 | 512 MiB |
nlb-kube-apiserver-b | 10.1.2.12 | load balancer | 1 | 512 MiB |
kube-master-01 | 10.1.2.13 | master node | 2 | 2048 MiB |
kube-master-02 | 10.1.2.14 | master node | 2 | 2048 MiB |
kube-master-03 | 10.1.2.15 | master node | 2 | 2048 MiB |
kube-worker-01 | 10.1.2.21 | worker node | 2 | 4096 MiB |
kube-worker-02 | 10.1.2.22 | worker node | 2 | 4096 MiB |
kube-worker-03 | 10.1.2.23 | worker node | 2 | 4096 MiB |
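Local DNS records for these hosts are maintained via TSIG-signed dynamic updates. Purely as an illustration of the mechanism (the key file, DNS server address, and TTL below are assumptions, not values from this repo; the zone is presumed to be lan based on the kube-apiserver.lan endpoint used later), an equivalent manual update would look like:
# hypothetical TSIG key file and DNS server address
nsupdate -k /etc/bind/ddns-update.key <<EOF
server 10.1.2.1
zone lan
update add kube-master-01.lan 3600 A 10.1.2.13
send
EOF
In this repo the same records are created by Terraform rather than by hand.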
Load balancer instances are going to run Alpine Linux, so Terraform expects a corresponding image to be present on the Proxmox node. To download the expected image, execute on the Proxmox node:
curl -LO https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/cloud/nocloud_alpine-3.20.3-x86_64-uefi-cloudinit-r0.qcow2
mv nocloud_alpine-3.20.3-x86_64-uefi-cloudinit-r0.qcow2 /var/lib/vz/template/iso/nocloud_alpine-3.20.3-x86_64-uefi-cloudinit-r0.qcow2.img
Terraform automatically provisions and configures two load balancer instances using keepalived and HAproxy without any manual intervention. The summary of the automation steps is provided below.
Install `haproxy` and `keepalived`:
doas apk add keepalived
doas apk add haproxy
Create the keepalived configuration file /etc/keepalived/keepalived.conf:
global_defs {
    router_id ${_INSTANCE}
    checker_log_all_failures true
    vrrp_version 3
    vrrp_no_swap
    checker_no_swap
    enable_script_security
    max_auto_priority
}

vrrp_track_process tracked_process {
    process "haproxy"
    quorum 1
    quorum_max 2
    fork_delay 2
    terminate_delay 2
}

vrrp_script curl_healthz {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    timeout 2
    weight 0
    rise 2
    fall 10
    user root
}

vrrp_instance nlb-kube-apiserver {
    state BACKUP
    interface eth0
    track_process {
        tracked_process
    }
    track_script {
        curl_healthz
    }
    check_unicast_src
    unicast_peer {
        @^nlb-kube-apiserver-a 10.1.2.11
        @^nlb-kube-apiserver-b 10.1.2.12
    }
    unicast_fault_no_peer
    virtual_router_id 51
    priority ${_RANDOM 100 150}
    nopreempt
    advert_int 1
    virtual_ipaddress {
        10.1.2.10/24
    }
}
Create the curl_healthz script file /etc/keepalived/check_apiserver.sh:
#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

/usr/bin/curl -sfk --max-time 2 https://localhost:6443/healthz -o /dev/null || errorExit "Error GET https://localhost:6443/healthz"
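Keepalived executes this check via a shell as root, so the script must be executable; if the automation does not already set the mode, it can be done with:
doas chmod +x /etc/keepalived/check_apiserver.sh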
To configure haproxy, put the following contents into /etc/haproxy/haproxy.cfg:
global
    log stdout format raw daemon info
    daemon

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 1
    timeout http-request 10s
    timeout queue 20s
    timeout connect 5s
    timeout client 35s
    timeout server 35s
    timeout http-keep-alive 10s
    timeout check 10s

frontend apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    option httpchk
    http-check connect ssl
    http-check send meth GET uri /healthz
    http-check expect status 200
    mode tcp
    balance roundrobin
    server kube-master-01 10.1.2.13:6443 check verify none
    server kube-master-02 10.1.2.14:6443 check verify none
    server kube-master-03 10.1.2.15:6443 check verify none
Now start the services:
doas rc-update add haproxy && doas rc-service haproxy start
doas rc-update add keepalived && doas rc-service keepalived start
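As a quick sanity check (not part of the automation), one can verify which load balancer currently holds the virtual IP by inspecting the interface used in the keepalived configuration:
ip -4 addr show eth0 | grep 10.1.2.10
The address should appear on exactly one of the two instances; stopping haproxy on that instance causes the tracked-process check to fail and the VIP to move to the peer.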
This concludes the configuration of the load balancers.
VMs for the control plane and worker nodes use Rocky Linux images. Instructions on how to fetch the corresponding image are available here.
VM configuration is automated via cloud-init, and after deployment the VMs are ready for cluster initialization.
NOTE: The Rocky Linux image already comes with swap disabled and without firewalld installed.
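Both preconditions can be confirmed on any node with generic checks (not part of the automation):
swapon --show    # no output means swap is disabled
rpm -q firewalld # expected to report that the package is not installed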
The summary of the automation steps is provided below.
Set SELinux to permissive mode and disable it completely on the next reboot:
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
sudo grubby --update-kernel ALL --args selinux=0
Enable necessary kernel modules
$ cat <<EOF |sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
$ sudo modprobe overlay
$ sudo modprobe br_netfilter
Enable necessary system settings
$ cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
$ sudo sysctl --system
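Optionally, verify that the settings are in effect (each key should report 1):
sudo sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables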
Install the containerd container runtime and start the service:
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo dnf install containerd.io -y
sudo systemctl enable containerd --now
NOTE: The steps below presume that several infrastructural components are in place. To read more, look here and here. Alternatively, one may populate ssh-agent with credentials to access the Proxmox node and declare the TF_VAR_* variables mentioned in 000_variables.tf before launching terraform.
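A minimal sketch of that alternative might look as follows; the key path and variable name here are purely hypothetical, the real names are defined in 000_variables.tf:
ssh-add $HOME/.ssh/proxmox          # hypothetical key that can access the Proxmox node
export TF_VAR_proxmox_api_url="..."  # hypothetical variable name; see 000_variables.tf for the actual TF_VARs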
Note: All commands (unless stated otherwise) are expected to run from the repo's root directory.
Generate SSH keys:
ssh-keygen -q -C "" -N "" -f $HOME/.ssh/iac
ssh-keygen -q -C "" -N "" -f $HOME/.ssh/vm
Set env variables:
eval $(ssh-agent -s)
export VAULT_ADDR=https://aegis.lan:8200
vault login -method=oidc
Fetch Project's secrets:
source prime_env.sh <<< $(./gen_token.sh < secrets.list)
Generate the API token and sign the SSH key to access Proxmox:
source ./prime_proxmox_secrets.sh $HOME/.ssh/iac
Sign the second SSH key for VM access:
./sign_ssh_vm_key.sh $HOME/.ssh/vm alpine,rocky
Initialize providers and execute configuration:
terraform -chdir=110-Infrastructure_terraform init
terraform -chdir=110-Infrastructure_terraform plan
terraform -chdir=110-Infrastructure_terraform apply -parallelism=4
NOTE: The flag -parallelism=4 is necessary to avoid overloading the Proxmox node.
NOTE: The following steps need to be executed on all master and worker nodes.
Add Kubernetes repository:
$ cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.30/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.30/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
Install kubelet, kubeadm, and kubectl:
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
Enable kubelet.service:
sudo systemctl enable kubelet.service --now
NOTE: The kubelet is now restarting every few seconds. That is OK, as it waits in a crash loop for kubeadm to tell it what to do.
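The crash loop can be observed in the kubelet logs if desired:
sudo journalctl -u kubelet -f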
Configure containerd to use systemd cgroups:
containerd config default | sed 's/SystemdCgroup = false/SystemdCgroup = true/' | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd
Prefetch Kubernetes images
sudo kubeadm config images pull
NOTE: The following steps need to be executed on the kube-master-01 node.
Create the kubeadm-config.yaml file:
$ sudo mkdir /etc/kubernetes/kubeadm-config
$ cat <<EOF |sudo tee /etc/kubernetes/kubeadm-config/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
# target version of the control plane
kubernetesVersion: 1.30.0
# The cluster name.
clusterName: "test-cluster"
# a stable IP address or a RFC-1123 DNS subdomain (with optional TCP port) for the control plane;
controlPlaneEndpoint: "kube-apiserver.lan:6443"
# configuration for the networking topology of the cluster.
networking:
  serviceSubnet: "10.96.0.0/16"
  podSubnet: "10.244.0.0/16"
  dnsDomain: "cluster.local"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# CIDR range of the pods in the cluster.
clusterCIDR: "10.244.0.0/16"
---
# contains the configuration for the Kubelet
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# DNS domain for this cluster. If set, kubelet will configure all containers
# to search this domain in addition to the host's search domains.
clusterDomain: cluster.local
# list of IP addresses for the cluster DNS server. If set, kubelet will configure all containers
# to use this for DNS resolution instead of the host's DNS servers.
clusterDNS:
- 10.96.0.10
# path to the directory containing local (static) pods to run, or the path to a single static pod file.
staticPodPath: /etc/kubernetes/manifests
# enables client certificate rotation.
# The Kubelet will request a new certificate from the certificates.k8s.io API.
# This requires an approver to approve the certificate signing requests.
rotateCertificates: true
# enables server certificate bootstrap.
# Instead of self signing a serving certificate, the Kubelet will request a certificate from the 'certificates.k8s.io' API.
# This requires an approver to approve the certificate signing requests (CSR).
serverTLSBootstrap: true
# driver kubelet uses to manipulate CGroups on the host (cgroupfs or systemd). Default: "cgroupfs"
cgroupDriver: systemd
# caps the number of images reported in Node.status.images. The value must be greater than -2.
# Note: If -1 is specified, no cap will be applied. If 0 is specified, no image is returned. Default: 50
nodeStatusMaxImages: -1
# when enabled, tells the Kubelet to pull images one at a time.
serializeImagePulls: false
# quantity defining the maximum size of the container log file before it is rotated.
containerLogMaxSize: 50Mi
EOF
Initialize the cluster:
$ sudo kubeadm init --upload-certs --config /etc/kubernetes/kubeadm-config/kubeadm-config.yaml
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join kube-apiserver.lan:6443 --token tj8b72.9joqz1dm182wi19a \
--discovery-token-ca-cert-hash sha256:63f15d741527abc10f298787dcbe22fcddde866893e8881b82a90adcdb93c7ba \
--control-plane --certificate-key 492064f5ddec9ec68b351dbc6f14aa9b7d2b99db3fbf661c8eb630afad3f810c
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join kube-apiserver.lan:6443 --token tj8b72.9joqz1dm182wi19a \
--discovery-token-ca-cert-hash sha256:63f15d741527abc10f298787dcbe22fcddde866893e8881b82a90adcdb93c7ba
NOTE: The following steps need to be executed on the kube-master-02 and kube-master-03 nodes.
$ sudo kubeadm join kube-apiserver.lan:6443 --token tj8b72.9joqz1dm182wi19a \
--discovery-token-ca-cert-hash sha256:63f15d741527abc10f298787dcbe22fcddde866893e8881b82a90adcdb93c7ba \
--control-plane --certificate-key 492064f5ddec9ec68b351dbc6f14aa9b7d2b99db3fbf661c8eb630afad3f810c
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
NOTE: The following steps need to be executed on all worker nodes.
sudo kubeadm join kube-apiserver.lan:6443 --token tj8b72.9joqz1dm182wi19a \
--discovery-token-ca-cert-hash sha256:63f15d741527abc10f298787dcbe22fcddde866893e8881b82a90adcdb93c7ba
NOTE: The following steps need to be executed on the kube-master-01 node.
Configure kubectl:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check nodes
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP
kube-master-01 NotReady control-plane 8m42s v1.30.6 10.1.2.13
kube-master-02 NotReady control-plane 6m3s v1.30.6 10.1.2.14
kube-master-03 NotReady control-plane 5m19s v1.30.6 10.1.2.15
kube-worker-01 NotReady <none> 70s v1.30.6 10.1.2.21
kube-worker-02 NotReady <none> 63s v1.30.6 10.1.2.22
kube-worker-03 NotReady <none> 59s v1.30.6 10.1.2.23
Check Certificate Signing Requests
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-45pxb 9m35s kubernetes.io/kube-apiserver-client-kubelet system:node:kube-master-01 <none> Approved,Issued
csr-4ptb7 6m57s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:tj8b72 <none> Approved,Issued
csr-5t8wt 118s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:tj8b72 <none> Approved,Issued
csr-7bn5r 113s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:tj8b72 <none> Approved,Issued
csr-jjrlv 9m33s kubernetes.io/kubelet-serving system:node:kube-master-01 <none> Pending
csr-n7jgw 9m35s kubernetes.io/kubelet-serving system:node:kube-master-01 <none> Pending
csr-rxxzl 2m3s kubernetes.io/kubelet-serving system:node:kube-worker-01 <none> Pending
csr-sm2d8 6m13s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:tj8b72 <none> Approved,Issued
csr-tb2pm 6m12s kubernetes.io/kubelet-serving system:node:kube-master-03 <none> Pending
csr-v8n4v 112s kubernetes.io/kubelet-serving system:node:kube-worker-03 <none> Pending
csr-wgq9j 6m56s kubernetes.io/kubelet-serving system:node:kube-master-02 <none> Pending
csr-z7hcs 117s kubernetes.io/kubelet-serving system:node:kube-worker-02 <none> Pending
csr-z9pqq 2m4s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:tj8b72 <none> Approved,Issued
Sign certificates
WARNING: the command below will approve all pending requests. Use with caution.
$ kubectl get csr -o json | jq -r '.items[] | select(.status.conditions | not) | .metadata.name' | xargs -I {} kubectl certificate approve {}
certificatesigningrequest.certificates.k8s.io/csr-jjrlv approved
certificatesigningrequest.certificates.k8s.io/csr-n7jgw approved
certificatesigningrequest.certificates.k8s.io/csr-rxxzl approved
certificatesigningrequest.certificates.k8s.io/csr-tb2pm approved
certificatesigningrequest.certificates.k8s.io/csr-v8n4v approved
certificatesigningrequest.certificates.k8s.io/csr-wgq9j approved
certificatesigningrequest.certificates.k8s.io/csr-z7hcs approved
Install the Calico pod network add-on:
curl -LO https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/calico.yaml
kubectl apply -f calico.yaml
NOTE: It may take a noticeable amount of time to download images and start containers.
Check that the pod network is installed (nodes should be in the Ready state):
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-master-01 Ready control-plane 24m v1.30.6
kube-master-02 Ready control-plane 22m v1.30.6
kube-master-03 Ready control-plane 21m v1.30.6
kube-worker-01 Ready <none> 17m v1.30.6
kube-worker-02 Ready <none> 17m v1.30.6
kube-worker-03 Ready <none> 17m v1.30.6
To install the Metrics Server manifest:
curl -L -o metrics-server.yaml https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl apply -f metrics-server.yaml
Check if metrics are available
$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
kube-master-01 312m 15% 923Mi 56%
kube-master-02 293m 14% 1098Mi 66%
kube-master-03 285m 14% 858Mi 52%
kube-worker-01 113m 5% 436Mi 12%
kube-worker-02 152m 7% 498Mi 14%
kube-worker-03 121m 6% 458Mi 13%
To be able to create services of type LoadBalancer, there are several options available:
- https://github.com/metallb/metallb
- https://github.com/kube-vip/kube-vip
- https://github.com/openelb/openelb
Let's try MetalLB in the L2 operating mode (ARP).
NOTE: When using the L2 operating mode, traffic on port 7946 (TCP & UDP) must be allowed between nodes, which should already be the case as we do not have a firewall enabled.
Install MetalLB manifest:
curl -LO https://raw.githubusercontent.com/metallb/metallb/v0.14.8/config/manifests/metallb-native.yaml
kubectl apply -f metallb-native.yaml
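Before applying the address pool configuration, it is worth waiting until the MetalLB controller and speaker pods are running:
kubectl -n metallb-system get pods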
Configure IP pool range and its advertisement:
$ cat <<EOF > servicelb-config.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: servicelb-pool
  namespace: metallb-system
spec:
  addresses:
    - 10.1.2.16-10.1.2.20
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: servicelb-l2adv
  namespace: metallb-system
spec:
  ipAddressPools:
    - servicelb-pool
EOF
Apply the manifest
kubectl apply -f servicelb-config.yaml
For a quick test, let's create a deployment and expose it as a LoadBalancer service:
apiVersion: v1
kind: Service
metadata:
  name: test-nginx
spec:
  selector:
    app: test-nginx
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: test-nginx
  template:
    metadata:
      labels:
        app: test-nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          resources:
            requests:
              cpu: 100m
              memory: 50Mi
            limits:
              cpu: 200m
              memory: 100Mi
After deploying the manifest, check the created objects:
$ kubectl get services,endpoints
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2d18h
service/test-nginx LoadBalancer 10.96.101.139 10.1.2.16 80:30500/TCP 5m28s
NAME ENDPOINTS AGE
endpoints/kubernetes 10.1.2.13:6443,10.1.2.14:6443,10.1.2.15:6443 2d18h
endpoints/test-nginx 10.244.188.131:80,10.244.255.199:80,10.244.84.131:80 5m28s
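The service should already answer on its external IP with the nginx welcome page:
curl http://10.1.2.16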
As upgrading Kubernetes requires moving one minor version at a time, let's upgrade our v1.30 cluster to v1.31:
- Back up etcd
- Upgrade the primary control plane node
- Upgrade additional control plane nodes
- Upgrade worker nodes
NOTE: The following commands should be executed on a master node.
Install etcdctl (and etcdutl) corresponding to the current version of the etcd static pod:
curl -LO https://github.com/etcd-io/etcd/releases/download/v3.5.15/etcd-v3.5.15-linux-amd64.tar.gz
tar xvf etcd-v3.5.15-linux-amd64.tar.gz
sudo mv etcd-v3.5.15-linux-amd64/etcdctl /usr/bin/
sudo mv etcd-v3.5.15-linux-amd64/etcdutl /usr/bin/
rm -rf etcd-v3.5.15-linux-amd64
Switch to the root shell and, for convenience, add the following alias:
$ sudo -i
# alias etcdctl='ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cert=/etc/kubernetes/pki/etcd/peer.crt \
--key=/etc/kubernetes/pki/etcd/peer.key \
--cacert=/etc/kubernetes/pki/etcd/ca.crt'
Test if etcd is available
# etcdctl member list
7d923aaf0790a670, started, kube-master-01, https://10.1.2.13:2380, https://10.1.2.13:2379, false
9037b32cd3760cb5, started, kube-master-03, https://10.1.2.15:2380, https://10.1.2.15:2379, false
a4393d5e8c29f0f9, started, kube-master-02, https://10.1.2.14:2380, https://10.1.2.14:2379, false
Create and verify the snapshot
# etcdctl snapshot save snapshot-1.30.db
Snapshot saved at snapshot-1.30.db
# etcdutl --write-out=table snapshot status snapshot-1.30.db
+----------+----------+------------+------------+
| HASH | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| b4eff0be | 534233 | 1175 | 6.0 MB |
+----------+----------+------------+------------+
Change the Kubernetes repository to v1.31:
$ cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.31/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.31/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
Check that we see a new version
sudo yum list --showduplicates kubeadm --disableexcludes=kubernetes
Upgrade kubeadm
sudo yum install -y kubeadm-'1.31.2' --disableexcludes=kubernetes
Drain the node
kubectl drain kube-master-01 --ignore-daemonsets
Check upgrade plan
sudo kubeadm upgrade plan
Upgrade, first stopping the kube-apiserver process so it can complete in-flight requests:
sudo pkill -SIGTERM kube-apiserver && sleep 5
sudo kubeadm upgrade apply v1.31.2
Upgrade kubelet and kubectl:
sudo yum install -y kubelet-'1.31.2' kubectl-'1.31.2' --disableexcludes=kubernetes
Restart kubelet
sudo systemctl daemon-reload && sudo systemctl restart kubelet
Uncordon the node
kubectl uncordon kube-master-01
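At this point kube-master-01 should report the new version (the remaining nodes still show v1.30.x):
kubectl get nodes kube-master-01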
Note: The following steps should be taken for each of the nodes kube-master-02 and kube-master-03.
Drain the node
kubectl drain kube-master-02 --ignore-daemonsets
Log in to the node and upgrade the repo to v1.31:
$ cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.31/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.31/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
Upgrade kubeadm
sudo yum install -y kubeadm-'1.31.2' --disableexcludes=kubernetes
Upgrade the node, stopping the kube-apiserver process first:
sudo pkill -SIGTERM kube-apiserver && sleep 5
sudo kubeadm upgrade node
Upgrade kubelet and kubectl:
sudo yum install -y kubelet-'1.31.2' kubectl-'1.31.2' --disableexcludes=kubernetes
Restart kubelet
sudo systemctl daemon-reload && sudo systemctl restart kubelet
Uncordon the node
kubectl uncordon kube-master-02
NOTE: The following steps should be taken for each worker node
Drain the worker node
kubectl drain kube-worker-01 --ignore-daemonsets
Log in to the node and upgrade the repo to v1.31:
$ cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.31/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.31/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
Upgrade kubeadm, kubelet, and kubectl:
sudo yum install -y kubeadm-'1.31.2' kubelet-'1.31.2' kubectl-'1.31.2' --disableexcludes=kubernetes
Upgrade the node
sudo kubeadm upgrade node
Restart kubelet
sudo systemctl daemon-reload && sudo systemctl restart kubelet
Uncordon the node
kubectl uncordon kube-worker-01
NOTE: This is normally not required after the upgrade; the situation below is purely synthetic, for training purposes only.
Distribute the backup file we created previously to the master nodes:
scp snapshot-1.30.db rocky@10.1.2.13:/home/rocky/snapshot-1.30.db
scp snapshot-1.30.db rocky@10.1.2.14:/home/rocky/snapshot-1.30.db
scp snapshot-1.30.db rocky@10.1.2.15:/home/rocky/snapshot-1.30.db
Delete the test deployment, so we can "restore" it later
kubectl delete svc test-nginx
kubectl delete deploy test-nginx
The high-level restoration plan is:
- stop all API server instances
- restore state in all etcd instances
- restart all API server instances
- restart kubelet, controller-manager, and scheduler components
NOTE: The following steps MUST be taken on all the master nodes.
Let's stop the cluster components except for kubelet and kube-proxy (basically preventing any changes in the cluster):
sudo mkdir /etc/kubernetes/manifests.stopped
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /etc/kubernetes/manifests.stopped/
sudo mv /etc/kubernetes/manifests/etcd.yaml /etc/kubernetes/manifests.stopped/
sudo mv /etc/kubernetes/manifests/kube-controller-manager.yaml /etc/kubernetes/manifests.stopped/
sudo mv /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/manifests.stopped/
Delete /var/lib/etcd (optionally backing it up somewhere first):
sudo rm -rf /var/lib/etcd
Restoring the same snapshot to all members would override their metadata, preventing them from re-joining the etcd cluster and effectively ending up in a split-brain situation. To avoid such a scenario, let's restore while supplying the membership information to the datastore:
## For the kube-master-01
sudo etcdutl snapshot restore snapshot-1.30.db \
--data-dir /var/lib/etcd \
--name kube-master-01 \
--initial-cluster kube-master-01=https://10.1.2.13:2380,kube-master-02=https://10.1.2.14:2380,kube-master-03=https://10.1.2.15:2380 \
--initial-advertise-peer-urls https://10.1.2.13:2380
## For the kube-master-02
sudo etcdutl snapshot restore snapshot-1.30.db \
--data-dir /var/lib/etcd \
--name kube-master-02 \
--initial-cluster kube-master-01=https://10.1.2.13:2380,kube-master-02=https://10.1.2.14:2380,kube-master-03=https://10.1.2.15:2380 \
--initial-advertise-peer-urls https://10.1.2.14:2380
## For the kube-master-03
sudo etcdutl snapshot restore snapshot-1.30.db \
--data-dir /var/lib/etcd \
--name kube-master-03 \
--initial-cluster kube-master-01=https://10.1.2.13:2380,kube-master-02=https://10.1.2.14:2380,kube-master-03=https://10.1.2.15:2380 \
--initial-advertise-peer-urls https://10.1.2.15:2380
Restore the etcd static pod:
sudo mv /etc/kubernetes/manifests.stopped/etcd.yaml /etc/kubernetes/manifests/
Verify that the etcd pod has started:
sudo crictl ps --name etcd
Check that all three members are available
$ sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cert=/etc/kubernetes/pki/etcd/peer.crt \
--key=/etc/kubernetes/pki/etcd/peer.key \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
member list -w table
+------------------+---------+----------------+------------------------+------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+----------------+------------------------+------------------------+------------+
| dd87136770f03d8 | started | kube-master-02 | https://10.1.2.14:2380 | https://10.1.2.14:2379 | false |
| 161a8569f4c52ec7 | started | kube-master-03 | https://10.1.2.15:2380 | https://10.1.2.15:2379 | false |
| 7d923aaf0790a670 | started | kube-master-01 | https://10.1.2.13:2380 | https://10.1.2.13:2379 | false |
+------------------+---------+----------------+------------------------+------------------------+------------+
Restart the other cluster components:
sudo mv /etc/kubernetes/manifests.stopped/kube-controller-manager.yaml /etc/kubernetes/manifests/
sudo mv /etc/kubernetes/manifests.stopped/kube-scheduler.yaml /etc/kubernetes/manifests/
sudo mv /etc/kubernetes/manifests.stopped/kube-apiserver.yaml /etc/kubernetes/manifests/
sudo systemctl daemon-reload && sudo systemctl restart kubelet
It won't hurt to restart kubelet on the worker nodes:
sudo systemctl daemon-reload && sudo systemctl restart kubelet
Check that the test deployment and service are present and operational:
$ kubectl get svc,deploy
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 3d
service/test-nginx LoadBalancer 10.96.101.139 10.1.2.16 80:30500/TCP 5h38m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/test-nginx 3/3 3 3 5h38m
$ curl http://10.1.2.16:80
<!DOCTYPE html>
<html>
...
</html>
For troubleshooting one may try recreating all the pods:
kubectl delete po --all -A