K3S Kubernetes Cluster at home automated with Ansible and ArgoCD
This is an educational project to build a hybrid x86/ARM Kubernetes cluster at home using Raspberry Pi boards and refurbished x86 mini PCs, learn to deploy basic Kubernetes services, and automate the cluster's deployment and configuration by applying IaC (infrastructure as code) and GitOps methodologies.
The entire process of creating this cluster at home, from cluster design and architecture to step-by-step manual configuration guides, is documented and published on the project website: https://picluster.ricsanfre.com.
This repository contains all source code used to automate all manual tasks described in the documentation: Cloud-init's configuration files, Ansible's source code (playbooks/roles), and packaged Kubernetes applications (helm and kustomize) to be deployed using ArgoCD.
Since its deployment is completely automated, the cluster can be re-deployed in minutes as many times as needed to test new cluster configurations or new software versions, or just to get you out of any mess you could cause while playing with the cluster.
The scope of this project is to build a hybrid x86/ARM Kubernetes cluster at home, using low-cost Raspberry Pis and old refurbished mini PCs, and to automate its deployment and configuration by applying IaC (infrastructure as code) and GitOps methodologies with tools like Ansible, cloud-init and ArgoCD.
As part of the project, the goal is to use a lightweight Kubernetes flavor based on K3S and deploy basic cluster services such as: 1) distributed block storage for Pods' persistent volumes (Longhorn); 2) a backup/restore solution for the cluster (Velero and Restic); 3) a service mesh architecture (Linkerd); and 4) an observability platform based on a metrics monitoring solution (Prometheus), a logging and analytics solution (the EFK+LG stack: Elasticsearch-Fluentd/Fluentbit-Kibana + Loki-Grafana), and a distributed tracing solution (Tempo).
The following table lists the open-source solutions used so far in the cluster, whose installation process has been documented and whose deployment has been automated with Ansible/ArgoCD:
Name | Description
---|---
Ansible | Automate OS configuration, external services installation and k3s installation and bootstrapping
ArgoCD | GitOps tool for deploying applications to Kubernetes
Cloud-init | Automate OS initial installation
Ubuntu | Cluster nodes OS
K3S | Lightweight distribution of Kubernetes
containerd | Container runtime integrated with K3S
Flannel | Kubernetes Networking (CNI) integrated with K3S
CoreDNS | Kubernetes DNS
HA Proxy | Kubernetes API Load-balancer
Metal LB | Load-balancer implementation for bare metal Kubernetes clusters
Ingress NGINX | Kubernetes Ingress Controller
Traefik | Kubernetes Ingress Controller (alternative)
Linkerd | Kubernetes Service Mesh
Longhorn | Kubernetes distributed block storage
Minio | S3 Object Storage solution
Cert-manager | TLS Certificates management
Hashicorp Vault | Secrets Management solution
External Secrets Operator | Sync Kubernetes Secrets from Hashicorp Vault
Keycloak | Identity Access Management
OAuth2.0 Proxy | OAuth2.0 Proxy
Velero | Kubernetes Backup and Restore solution
Restic | OS Backup and Restore solution
Prometheus | Metrics monitoring and alerting
Fluentd | Logs forwarding and distribution
Fluentbit | Logs collection
Loki | Logs aggregation
Elasticsearch | Logs analytics
Kibana | Logs analytics Dashboards
Tempo | Distributed tracing monitoring
Grafana | Monitoring Dashboards
Even though the premise is to deploy all services in the Kubernetes cluster, a few external services/resources are still needed. They are listed below, along with the reason each one is needed.
Provider | Resource | Purpose
---|---|---
Let's Encrypt | TLS Certificate Authority | Signed valid TLS certificates
IONOS | DNS | DNS and DNS-01 challenge for certificates
NOTE: These resources are optional. The homelab still works without them, but it won't have trusted certificates.
Alternatives:
- Use a private PKI (custom CA to sign certificates).
  Currently supported. Only minor changes are required. See details in the documentation's Quick Start instructions.
- Use another DNS provider.
  Cert-manager / Certbot, which are the tools that automatically obtain certificates from Let's Encrypt, can be configured to use other DNS providers. This requires further modifications to the way the cert-manager application is deployed (new providers and/or webhooks/plugins might be required).
  Currently only an ACME issuer (Let's Encrypt) using IONOS as the dns-01 challenge provider is configured. Check the list of supported dns01 providers.
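As a rough illustration of the two certificate options above, the following cert-manager manifests sketch a private-PKI issuer and an ACME issuer that solves DNS-01 challenges through a webhook. This is a minimal sketch, not the configuration used in this repository: every resource name, the e-mail address and the webhook `groupName`/`solverName` values are placeholders, and the real webhook values depend on the DNS provider's plugin.

```yaml
# Option 1: private PKI. Certificates are signed by a custom CA.
# Assumes a TLS Secret named "homelab-ca-keypair" (hypothetical name) holding
# the CA certificate and key already exists in cert-manager's namespace.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: ca-issuer                      # placeholder name
spec:
  ca:
    secretName: homelab-ca-keypair     # placeholder Secret name
---
# Option 2: ACME issuer (Let's Encrypt) with a DNS-01 webhook solver.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-issuer             # placeholder name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com           # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-account-key    # Secret where the ACME account key is stored
    solvers:
      - dns01:
          webhook:
            groupName: acme.example.com   # placeholder, defined by the provider's webhook
            solverName: example-dns       # placeholder, defined by the provider's webhook
            config: {}                    # provider-specific settings (API credentials reference, etc.)
```

Switching DNS providers would mostly mean replacing the `dns01` solver section, which is why the rest of the issuer can stay unchanged.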
There is another set of services that I have decided to self-host outside the Kubernetes cluster.
External Service | Resource | Purpose
---|---|---
Minio | S3 Object Store | Cluster Backup
Hashicorp Vault | Secrets Management | Cluster secrets management
The Minio backup service is hosted in a VM running in a public cloud, using the Oracle Cloud Infrastructure (OCI) free tier.
The Vault service runs on the gateway node. Since the Vault Kubernetes authentication method needs access to the Kubernetes API, I won't host the Vault service in a public cloud.
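With the Kubernetes authentication method, Vault checks the service account token presented by a client by calling the Kubernetes TokenReview API, which is why it must be able to reach the API server. The manifest below is a minimal sketch of how External Secrets Operator could point at such an external Vault. It is an assumption for illustration only: the server URL, mount path, role and service account names are hypothetical, and the `apiVersion` depends on the installed operator version.

```yaml
# Hypothetical ClusterSecretStore: External Secrets Operator reads secrets
# from a Vault server running outside the cluster (e.g. on the gateway node).
apiVersion: external-secrets.io/v1beta1   # may differ with the operator version
kind: ClusterSecretStore
metadata:
  name: vault-backend                     # placeholder name
spec:
  provider:
    vault:
      server: "https://vault.example.internal:8200"  # placeholder Vault address
      path: "secret"                      # KV secrets engine mount (assumed)
      version: "v2"                       # KV engine version (assumed)
      auth:
        kubernetes:
          mountPath: "kubernetes"         # Vault auth mount for this cluster (assumed)
          role: "external-secrets"        # Vault role bound to the service account (assumed)
          serviceAccountRef:
            name: "external-secrets"      # service account whose token is sent to Vault
            namespace: "external-secrets"
```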
The home lab architecture, shown in the picture below, consists of a Kubernetes cluster of ARM (Raspberry Pi) and x86 (HP EliteDesk 800 G3 mini PC) nodes and a firewall, built with another Raspberry Pi, to isolate the cluster network from your home network.
See further details about the architecture and hardware in the documentation.
You can find more information about the Pi Cluster project at https://picluster.ricsanfre.com/.
The content of this website and the source code to build it (a Jekyll-based static website) are also stored in this repo, in the /docs folder.
Check out the documentation's Quick Start guide to learn how to use and tweak the cloud-init files (/cloud-init folder), Ansible playbooks (/ansible folder) and packaged Kubernetes applications (/argocd folder) contained in this repository, so you can use them for your own homelab.
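To give an idea of what deploying a packaged application with ArgoCD looks like, here is a minimal sketch of an ArgoCD Application manifest. It is not taken from this repository: the repository URL, application name, path and target namespace are placeholders, and the Quick Start guide remains the reference for the actual layout of the /argocd folder.

```yaml
# Hypothetical ArgoCD Application: ArgoCD watches a path in a Git repository
# and keeps the cluster in sync with the manifests (helm/kustomize) found there.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app                # placeholder application name
  namespace: argocd                # namespace where ArgoCD is installed (assumed)
spec:
  project: default
  source:
    repoURL: https://github.com/example/homelab.git  # placeholder: your fork of the repo
    targetRevision: HEAD
    path: argocd/example-app       # placeholder path to the packaged application
  destination:
    server: https://kubernetes.default.svc   # deploy into the same cluster ArgoCD runs in
    namespace: example-app         # placeholder target namespace
  syncPolicy:
    automated:
      prune: true                  # delete resources removed from Git
      selfHeal: true               # revert manual changes made in the cluster
    syncOptions:
      - CreateNamespace=true       # create the target namespace if it is missing
```

With automated sync enabled, a change pushed to Git is applied to the cluster without further manual steps.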
This project was started in June 2021 by Ricardo Sanchez.