Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev-docs: Helm chart for full L3 VPN connectivity #2620

Merged
merged 2 commits into from
Jan 16, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
72 changes: 60 additions & 12 deletions dev-docs/howto/vpn/helm/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,35 +2,83 @@

This Helm chart deploys a VPN server to your Constellation cluster.

## Installation
## Prerequisites

1. Create and populate the configuration.
* Constellation >= v2.14.0
* A publicly routable VPN endpoint on premises that supports IPSec in IKEv2
tunnel mode with NAT traversal enabled.
* A list of on-prem CIDRs that should be reachable from Constellation.

## Setup

1. Configure Cilium to route services for the VPN (see [Architecture](#architecture) for details).
* Edit the Cilium config: `kubectl -n kube-system edit configmap cilium-config`.
* Set the config item `enable-sctp: "true"`.
* Restart the Cilium agents: `kubectl -n kube-system rollout restart daemonset/cilium`.

2. Create the Constellation VPN configuration file.

```sh
helm inspect values . >config.yaml
```

2. Install the Helm chart.
3. Populate the Constellation VPN configuration file. At least the following
need to be configured:
* The list of on-prem CIDRs (`peerCIDRs`).
* The `ipsec` subsection.

4. Install the Helm chart.

```sh
helm install -f config.yaml vpn .
```

3. Follow the post-installation instructions displayed by the CLI.
5. Configure the on-prem gateway with Constellation's pod and service CIDR
(see `config.yaml`).

## Things to try

Ask CoreDNS about its own service IP:

```sh
dig +notcp @10.96.0.10 kube-dns.kube-system.svc.cluster.local
```

Ask the Kubernetes API server about its wellbeing:

```sh
curl --insecure https://10.96.0.1:6443/healthz
```

Ping a pod:

```sh
ping $(kubectl get pods vpn-frontend-0 -o go-template --template '{{ .status.podIP }}')
```

## Architecture

The VPN server is deployed as a `StatefulSet` to the cluster. It hosts the VPN frontend component, which is responsible for relaying traffic between the pod and the on-prem network, and the routing components that provide access to Constellation resources. The frontend supports IPSec and Wireguard.
The VPN server is deployed as a `StatefulSet` to the cluster. It hosts the VPN
frontend component, which is responsible for relaying traffic between the pod
and the on-prem network over an IPSec tunnel.

The VPN frontend is exposed with a public LoadBalancer so that it becomes
accessible from the on-prem network.

The VPN frontend is exposed with a public LoadBalancer to be accessible from the on-prem network. Traffic that reaches the VPN server pod is split into two categories: pod IPs and service IPs.
An init container sets up IP routes on the frontend host and inside the
frontend pod. All routes are bound to the frontend pod's lxc interface and thus
deleted together with it.

The pod IP range is NATed with an iptables rule. On-prem worklaods can establish connections to a pod IP, but the Constellation workloads will see the client IP translated to that of the VPN frontend pod.
A VPN operator deployment is added that configures the `CiliumEndpoint` with
on-prem IP ranges, thus configuring routes on non-frontend hosts. The endpoint
shares the frontend pod's lifecycle.

The service IP range is handed to a transparent proxy running in the VPN frontend pod, which relays the connection to a backend pod. This is necessary because of the load-balancing mechanism of Cilium, which assumes service IP traffic to originate from the Constellation cluster itself. As for pod IP ranges, Constellation pods will only see the translated client address.
In Cilium's default configuration, service endpoints are resolved in cgroup
eBPF hooks that are not applicable to VPN traffic. We force Cilium to apply
service NAT at the LXC interface by enabling SCTP support.

## Limitations

* Service IPs need to be proxied by the VPN frontend pod. This is a single point of failure, and it may become a bottleneck.
* IPs are NATed, so the Constellation pods won't see the real on-prem IPs.
* NetworkPolicy can't be applied selectively to the on-prem ranges.
* No connectivity from Constellation to on-prem workloads.
* VPN traffic is handled by a single pod, which may become a bottleneck.
* Frontend pod restarts / migrations invalidate IPSec connections.
* Only pre-shared key authentication is supported.
46 changes: 46 additions & 0 deletions dev-docs/howto/vpn/helm/files/operator/entrypoint.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
#!/bin/sh

signaled() {
exit 143
}

trap signaled INT TERM

all_ips() {
kubectl get pods "${VPN_FRONTEND_POD}" -o go-template --template '{{ range .status.podIPs }}{{ printf "%s " .ip }}{{ end }}'
echo "${VPN_PEER_CIDRS}"
}

cep_patch() {
for ip in $(all_ips); do printf '{"ipv4": "%s"}' "${ip}"; done | jq -s -c -j |
jq '[{op: "replace", path: "/status/networking/addressing", value: . }]'
}

# Format the space-separated CIDRs into a JSON array.
vpn_cidrs=$(for ip in ${VPN_PEER_CIDRS}; do printf '"%s" ' "${ip}"; done | jq -s -c -j)

masq_patch() {
kubectl -n kube-system get configmap ip-masq-agent -o json |
jq -r .data.config |
jq "{ masqLinkLocal: .masqLinkLocal, nonMasqueradeCIDRs: ((.nonMasqueradeCIDRs - ${vpn_cidrs}) + ${vpn_cidrs}) }" |
jq '@json | [{op: "replace", path: "/data/config", value: . }]'
}

reconcile_masq() {
if ! kubectl -n kube-system get configmap ip-masq-agent > /dev/null; then
# We don't know enough to create an ip-masq-agent.
return 0
fi

kubectl -n kube-system patch configmap ip-masq-agent --type json --patch "$(masq_patch)" > /dev/null
}

while true; do
# Reconcile CiliumEndpoint to advertise VPN CIDRs.
kubectl patch ciliumendpoint "${VPN_FRONTEND_POD}" --type json --patch "$(cep_patch)" > /dev/null

# Reconcile ip-masq-agent configuration to exclude VPN traffic.
reconcile_masq

sleep 10
done
44 changes: 44 additions & 0 deletions dev-docs/howto/vpn/helm/files/strongswan/sidecar.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
#!/bin/sh

set -u

if [ "$$" -eq "1" ]; then
echo 'This script must run in the root PID namespace, but $$ == 1!' >&2
exit 1
fi

myip() {
ip -j addr show eth0 | jq -r '.[0].addr_info[] | select(.family == "inet") | .local'
}

# Disable source IP verification on our network interface. Otherwise, VPN
# packets will be dropped by Cilium.
reconcile_sip_verification() {
# We want all of the cilium calls in this function to target the same
# process, so that we fail if the agent restarts in between. Thus, we only
# query the pid once per reconciliation.
cilium_agent=$(pidof cilium-agent) || return 0

cilium() {
nsenter -t "${cilium_agent}" -a -r -w cilium "$@"
}

myendpoint=$(cilium endpoint get "ipv4:$(myip)" | jq '.[0].id') || return 0

if [ "$(cilium endpoint config "${myendpoint}" -o json | jq -r .realized.options.SourceIPVerification)" = "Enabled" ]; then
cilium endpoint config "${myendpoint}" SourceIPVerification=Disabled
fi
}

# Set up the route from the node network namespace to the VPN pod.
reconcile_route() {
for cidr in ${VPN_PEER_CIDRS}; do
nsenter -t 1 -n ip route replace "${cidr}" via "$(myip)"
done
}

while true; do
reconcile_route
reconcile_sip_verification
sleep 10
done
38 changes: 0 additions & 38 deletions dev-docs/howto/vpn/helm/files/tproxy-setup.sh

This file was deleted.

13 changes: 0 additions & 13 deletions dev-docs/howto/vpn/helm/files/wireguard-setup.sh

This file was deleted.

2 changes: 2 additions & 0 deletions dev-docs/howto/vpn/helm/templates/_helpers.tpl
Original file line number Diff line number Diff line change
Expand Up @@ -37,4 +37,6 @@ app.kubernetes.io/instance: {{ .Release.Name }}
value: {{ .Values.podCIDR | quote }}
- name: VPN_SERVICE_CIDR
value: {{ .Values.serviceCIDR | quote }}
- name: VPN_FRONTEND_POD
value: {{ include "..fullname" . }}-frontend-0
{{- end }}
16 changes: 2 additions & 14 deletions dev-docs/howto/vpn/helm/templates/configmaps.yaml
Original file line number Diff line number Diff line change
@@ -1,27 +1,15 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "..fullname" . }}-tproxy
name: {{ include "..fullname" . }}-operator
labels: {{- include "..labels" . | nindent 4 }}
data:
{{ (.Files.Glob "files/tproxy-setup.sh").AsConfig | indent 2 }}
{{ (.Files.Glob "files/operator/*").AsConfig | indent 2 }}
---
{{- if .Values.wireguard.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "..fullname" . }}-wg
labels: {{- include "..labels" . | nindent 4 }}
data:
{{ (.Files.Glob "files/wireguard-setup.sh").AsConfig | indent 2 }}
{{- end }}
---
{{ if .Values.ipsec.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "..fullname" . }}-strongswan
labels: {{- include "..labels" . | nindent 4 }}
data:
{{ (.Files.Glob "files/strongswan/*").AsConfig | indent 2 }}
{{- end }}
32 changes: 32 additions & 0 deletions dev-docs/howto/vpn/helm/templates/operator-deployment.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "..fullname" . }}-operator
labels: {{- include "..labels" . | nindent 4 }}
spec:
replicas: 1
selector:
matchLabels:
{{- include "..selectorLabels" . | nindent 6 }}
component: operator
template:
metadata:
labels:
{{- include "..selectorLabels" . | nindent 8 }}
component: operator
spec:
serviceAccountName: {{ include "..fullname" . }}
automountServiceAccountToken: true
containers:
- name: operator
image: {{ .Values.image | quote }}
command: ["sh", "/scripts/entrypoint.sh"]
env: {{- include "..commonEnv" . | nindent 10 }}
volumeMounts:
- name: scripts
mountPath: "/scripts"
readOnly: true
volumes:
- name: scripts
configMap:
name: {{ include "..fullname" . }}-operator
33 changes: 33 additions & 0 deletions dev-docs/howto/vpn/helm/templates/rbac.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "..fullname" . }}
automountServiceAccountToken: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "..fullname" . }}
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get"]
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "patch"]
- apiGroups: ["cilium.io"]
resources: ["ciliumendpoints"]
verbs: ["get", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "..fullname" . }}
subjects:
- kind: ServiceAccount
name: {{ include "..fullname" . }}
namespace: {{ .Release.Namespace }}
roleRef:
kind: ClusterRole
name: {{ include "..fullname" . }}
apiGroup: rbac.authorization.k8s.io
13 changes: 0 additions & 13 deletions dev-docs/howto/vpn/helm/templates/secrets.yaml
Original file line number Diff line number Diff line change
@@ -1,15 +1,3 @@
{{- if .Values.wireguard.enabled }}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "..fullname" . }}-wg
labels:
{{- include "..labels" . | nindent 4 }}
data:
wg.conf: {{ include "wireguard.conf" . | b64enc }}
{{- end }}
---
{{ if .Values.ipsec.enabled }}
apiVersion: v1
kind: Secret
metadata:
Expand All @@ -18,4 +6,3 @@ metadata:
{{- include "..labels" . | nindent 4 }}
data:
swanctl.conf: {{ include "strongswan.swanctl-conf" . | b64enc }}
{{- end }}
7 changes: 0 additions & 7 deletions dev-docs/howto/vpn/helm/templates/service.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -11,16 +11,9 @@ spec:
component: frontend
externalTrafficPolicy: Local
ports:
{{- if .Values.ipsec.enabled }}
- name: isakmp
protocol: UDP
port: 500
- name: ipsec-nat-t
protocol: UDP
port: 4500
{{- end }}
{{- if .Values.wireguard.enabled }}
- name: wg
protocol: UDP
port: {{ .Values.wireguard.port }}
{{- end }}
Loading