Commit

dev-docs: full L3 connectivity in VPN chart
burgerdev committed Jan 5, 2024
1 parent 7a06430 commit 50796be
Showing 15 changed files with 296 additions and 100 deletions.
59 changes: 47 additions & 12 deletions dev-docs/howto/vpn/helm/README.md
@@ -2,21 +2,40 @@

This Helm chart deploys a VPN server to your Constellation cluster.

## Installation
## Prerequisites

1. Create and populate the configuration.
* Constellation >= v2.14.0
* A publicly routable VPN endpoint on premises that supports any of
* IPSec in IKEv2 tunnel mode with NAT traversal enabled
* Wireguard
* A list of on-prem CIDRs that should be reachable from Constellation.

## Setup

1. Configure Cilium to route services for the VPN (see [Architecture](#architecture) for details).
* Edit the Cilium config: `kubectl -n kube-system edit configmap cilium-config`.
* Set the config item `enable-sctp: "true"`.
* Restart the Cilium agents: `kubectl -n kube-system rollout restart daemonset/cilium`.

2. Create the Constellation VPN configuration file.

```sh
helm inspect values . >config.yaml
```

2. Install the Helm chart.
3. Populate the Constellation VPN configuration file. At least the following
need to be configured:
* The list of on-prem CIDRs (`peerCIDRs`).
* One of the VPN protocol subsections (`wireguard` or `ipsec`).

4. Install the Helm chart.

```sh
helm install -f config.yaml vpn .
```

3. Follow the post-installation instructions displayed by the CLI.
5. Configure the on-prem gateway with Constellation's pod and service CIDR
(see `config.yaml`).
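
The steps above can be summarized with a minimal `config.yaml` sketch for a Wireguard setup. Only the `peerCIDRs` key and the `wireguard.enabled` flag are taken from this chart; the CIDR value is a placeholder, and the `wireguard` subsection will need further protocol settings (see the inline comments produced by `helm inspect values .`):

```yaml
# Hypothetical on-prem range; replace with your own.
peerCIDRs:
  - 10.8.0.0/24
wireguard:
  enabled: true
```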

## Things to try

@@ -32,19 +51,35 @@ Ask the Kubernetes API server about its wellbeing:
curl --insecure https://10.96.0.1:6443/healthz
```

Ping a pod:

```sh
ping $(kubectl get pods vpn-frontend-0 -o go-template --template '{{ .status.podIP }}')
```

## Architecture

The VPN server is deployed as a `StatefulSet` to the cluster. It hosts the VPN frontend component, which is responsible for relaying traffic between the pod and the on-prem network, and the routing components that provide access to Constellation resources. The frontend supports IPSec and Wireguard.
The VPN server is deployed as a `StatefulSet` to the cluster. It hosts the VPN
frontend component, which is responsible for relaying traffic between the pod
and the on-prem network. The frontend supports IPSec and Wireguard.

The VPN frontend is exposed with a public LoadBalancer so that it becomes
accessible from the on-prem network.

The VPN frontend is exposed with a public LoadBalancer to be accessible from the on-prem network. Traffic that reaches the VPN server pod is split into two categories: pod IPs and service IPs.
An init container sets up IP routes on the frontend host and inside the
frontend pod. All routes are bound to the frontend pod's lxc interface and thus
deleted together with it.

The pod IP range is NATed with an iptables rule. On-prem workloads can establish connections to a pod IP, but the Constellation workloads will see the client IP translated to that of the VPN frontend pod.
A VPN operator deployment is added that configures the `CiliumEndpoint` with
on-prem IP ranges, thus configuring routes on non-frontend hosts. The endpoint
shares the frontend pod's lifecycle.
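
The operator's patch payload can be sketched in isolation. The IP and CIDR below are hypothetical stand-ins for the frontend pod IPs and peer CIDRs; the sketch requires `jq`:

```shell
# Hypothetical frontend pod IP plus one on-prem peer CIDR.
ips='192.168.5.3 10.8.0.0/24'

# Build a JSON Patch that replaces the CiliumEndpoint's addressing list.
addressing=$(for ip in ${ips}; do printf '{"ipv4": "%s"}' "${ip}"; done | jq -s -c)
patch="[{\"op\": \"replace\", \"path\": \"/status/networking/addressing\", \"value\": ${addressing}}]"
echo "${patch}"
```

A document of this shape is what the operator hands to `kubectl patch ciliumendpoint --type json`.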

The service IP range is handed to a transparent proxy running in the VPN frontend pod, which relays the connection to a backend pod. This is necessary because of the load-balancing mechanism of Cilium, which assumes service IP traffic to originate from the Constellation cluster itself. As for pod IP ranges, Constellation pods will only see the translated client address.
In Cilium's default configuration, service endpoints are resolved in cgroup
eBPF hooks that are not applicable to VPN traffic. We force Cilium to apply
service NAT at the LXC interface by enabling SCTP support.

## Limitations

* Service IPs need to be proxied by the VPN frontend pod. This is a single point of failure, and it may become a bottleneck.
* IPs are NATed, so the Constellation pods won't see the real on-prem IPs.
* NetworkPolicy can't be applied selectively to the on-prem ranges.
* No connectivity from Constellation to on-prem workloads.
* VPN traffic is handled by a single pod, which may become a bottleneck.
* Frontend pod restarts / migrations invalidate IPSec connections. Wireguard
should be able to handle restarts somewhat gracefully.
44 changes: 44 additions & 0 deletions dev-docs/howto/vpn/helm/files/routing/operator.sh
@@ -0,0 +1,44 @@
#!/bin/sh

# TODO: this needs to be determined from Helm values!
vpn_frontend=vpn-frontend-0

all_ips() {
kubectl get pods "${vpn_frontend}" -o go-template --template '{{ range .status.podIPs }}{{ printf "%s " .ip }}{{ end }}'
echo "${VPN_PEER_CIDRS}"
}

cep_patch() {
printf '[{"op": "replace", "path": "/status/networking/addressing", "value": '
for ip in $(all_ips); do printf '{"ipv4": "%s"}' "${ip}"; done | jq -s -c -j
echo '}]'
}

# Format the space-separated CIDRs into a JSON array.
vpn_cidrs=$(for ip in ${VPN_PEER_CIDRS}; do printf '"%s" ' "${ip}"; done | jq -s -c -j)

masq_patch() {
kubectl -n kube-system get configmap ip-masq-agent -o json |
jq -r .data.config |
jq "{ masqLinkLocal: .masqLinkLocal, nonMasqueradeCIDRs: ((.nonMasqueradeCIDRs - ${vpn_cidrs}) + ${vpn_cidrs}) }" |
jq '@json | [{op: "replace", path: "/data/config", value: . }]'
}

reconcile_masq() {
if ! kubectl -n kube-system get configmap ip-masq-agent > /dev/null; then
# We don't know enough to create an ip-masq-agent.
return 0
fi

kubectl -n kube-system patch configmap ip-masq-agent --type json --patch "$(masq_patch)" > /dev/null
}

while true; do
# Reconcile CiliumEndpoint to advertise VPN CIDRs.
kubectl patch ciliumendpoint "${vpn_frontend}" --type json --patch "$(cep_patch)" > /dev/null

# Reconcile ip-masq-agent configuration to exclude VPN traffic.
reconcile_masq

sleep 10
done
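
The `masq_patch` merge above can be exercised with canned data; the config and CIDR values here are hypothetical, and `jq` is required:

```shell
# Canned ip-masq-agent config and VPN peer CIDRs (hypothetical values).
current='{"masqLinkLocal": false, "nonMasqueradeCIDRs": ["169.254.0.0/16", "10.8.0.0/24"]}'
vpn_cidrs='["10.8.0.0/24","10.9.0.0/24"]'

# Subtracting the VPN CIDRs before appending them deduplicates the list,
# so repeated reconciliation runs leave it unchanged.
merged=$(echo "${current}" | jq -c "{ masqLinkLocal: .masqLinkLocal, nonMasqueradeCIDRs: ((.nonMasqueradeCIDRs - ${vpn_cidrs}) + ${vpn_cidrs}) }")
echo "${merged}"
```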
30 changes: 30 additions & 0 deletions dev-docs/howto/vpn/helm/files/routing/pod-l3-setup.sh
@@ -0,0 +1,30 @@
#!/bin/sh

set -eu

if [ "$$" -eq "1" ]; then
echo 'This script must run in the root PID namespace, but $$ == 1!' >&2
exit 1
fi

# Set up routes for VPN traffic. Inside our netns, point to the VPN interface.
# In the host network namespace, point to the pod interface.

for cidr in ${VPN_PEER_CIDRS}; do
ip route replace "${cidr}" dev "${VPN_INTERFACE}"
done

rm -f /var/run/netns/root
ip netns attach root 1

ip_root() {
ip netns exec root ip "$@"
}

lower_interface_id=$(ip -j l show eth0 | jq '.[0].link_index')
lower_interface=$(ip_root -j link show | jq -r ".[] | select(.ifindex == ${lower_interface_id}) | .ifname")

myip=$(ip -j addr show eth0 | jq -r '.[0].addr_info[] | select(.family == "inet") | .local')
for cidr in ${VPN_PEER_CIDRS}; do
ip_root route replace "${cidr}" via "${myip}" dev "${lower_interface}"
done
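
The interface lookup above relies on a veth property: inside the pod, eth0's `link_index` is the ifindex of its peer interface in the host namespace. The jq selection can be exercised against canned `ip -j link` output (interface names and indices here are hypothetical):

```shell
# Canned host-side `ip -j link show` output (hypothetical interfaces).
host_links='[{"ifindex": 2, "ifname": "ens3"}, {"ifindex": 42, "ifname": "lxc1234"}]'

# Inside the pod, eth0's link_index names its veth peer on the host.
lower_interface_id=42
lower_interface=$(echo "${host_links}" | jq -r ".[] | select(.ifindex == ${lower_interface_id}) | .ifname")
echo "${lower_interface}"
```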
29 changes: 29 additions & 0 deletions dev-docs/howto/vpn/helm/files/routing/sidecar.sh
@@ -0,0 +1,29 @@
#!/bin/sh

# Disable source IP verification on our network interface. Otherwise, VPN
# packets will be dropped by Cilium.

reconcile_sip_verification() {

cilium_agent=$(pidof cilium-agent)
myip=$(ip -j addr show eth0 | jq -r '.[0].addr_info[] | select(.family == "inet") | .local')

cilium() {
nsenter -t "${cilium_agent}" -a -r -w cilium "$@"
}

myendpoint=$(cilium endpoint get "ipv4:${myip}" | jq '.[0].id')

if [ "$(cilium endpoint config "${myendpoint}" -o json | jq -r .realized.options.SourceIPVerification)" = "Enabled" ]; then
cilium endpoint config "${myendpoint}" SourceIPVerification=Disabled
fi

}

while true; do
reconcile_sip_verification
sleep 10
done
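
The pod IP lookup the sidecar uses to find its own Cilium endpoint can be exercised against canned `ip -j addr` output (the addresses are hypothetical; `jq` is required):

```shell
# Canned `ip -j addr show eth0` output (hypothetical addresses).
addr_json='[{"ifname": "eth0", "addr_info": [{"family": "inet6", "local": "fe80::1"}, {"family": "inet", "local": "10.244.1.7"}]}]'

# Pick the IPv4 address, skipping IPv6 entries.
myip=$(echo "${addr_json}" | jq -r '.[0].addr_info[] | select(.family == "inet") | .local')
echo "${myip}"
```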
6 changes: 6 additions & 0 deletions dev-docs/howto/vpn/helm/files/strongswan/strongswan-setup.sh
@@ -0,0 +1,6 @@
#!/bin/sh

set -eu

ip link add dev "${VPN_INTERFACE}" type xfrm dev eth0 if_id 0xfe
ip link set dev "${VPN_INTERFACE}" up
38 changes: 0 additions & 38 deletions dev-docs/howto/vpn/helm/files/tproxy-setup.sh

This file was deleted.

12 changes: 3 additions & 9 deletions dev-docs/howto/vpn/helm/files/wireguard-setup.sh
@@ -2,12 +2,6 @@

set -eu

dev=vpn_wg0

ip link add dev "${dev}" type wireguard
wg setconf "${dev}" /etc/wireguard/wg.conf
ip link set dev "${dev}" up

for cidr in ${VPN_PEER_CIDRS}; do
ip route replace "${cidr}" dev "${dev}"
done
ip link add dev "${VPN_INTERFACE}" type wireguard
wg setconf "${VPN_INTERFACE}" /etc/wireguard/wg.conf
ip link set dev "${VPN_INTERFACE}" up
2 changes: 2 additions & 0 deletions dev-docs/howto/vpn/helm/templates/_helpers.tpl
@@ -37,4 +37,6 @@ app.kubernetes.io/instance: {{ .Release.Name }}
value: {{ .Values.podCIDR | quote }}
- name: VPN_SERVICE_CIDR
value: {{ .Values.serviceCIDR | quote }}
- name: VPN_INTERFACE
value: vpn0
{{- end }}
8 changes: 4 additions & 4 deletions dev-docs/howto/vpn/helm/templates/configmaps.yaml
@@ -1,12 +1,12 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "..fullname" . }}-tproxy
name: {{ include "..fullname" . }}-scripts
labels: {{- include "..labels" . | nindent 4 }}
data:
{{ (.Files.Glob "files/tproxy-setup.sh").AsConfig | indent 2 }}
---
{{ (.Files.Glob "files/routing/*.sh").AsConfig | indent 2 }}
{{- if .Values.wireguard.enabled }}
---
apiVersion: v1
kind: ConfigMap
metadata:
@@ -15,8 +15,8 @@ metadata:
data:
{{ (.Files.Glob "files/wireguard-setup.sh").AsConfig | indent 2 }}
{{- end }}
{{- if .Values.ipsec.enabled }}
---
{{ if .Values.ipsec.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
32 changes: 32 additions & 0 deletions dev-docs/howto/vpn/helm/templates/operator-deployment.yaml
@@ -0,0 +1,32 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "..fullname" . }}-operator
labels: {{- include "..labels" . | nindent 4 }}
spec:
replicas: 1
selector:
matchLabels:
{{- include "..selectorLabels" . | nindent 6 }}
component: operator
template:
metadata:
labels:
{{- include "..selectorLabels" . | nindent 8 }}
component: operator
spec:
serviceAccountName: {{ include "..fullname" . }}
automountServiceAccountToken: true
containers:
- name: operator
image: {{ .Values.image | quote }}
command: ["/bin/sh", "/scripts/operator.sh"]
env: {{- include "..commonEnv" . | nindent 10 }}
volumeMounts:
- name: scripts
mountPath: "/scripts"
readOnly: true
volumes:
- name: scripts
configMap:
name: {{ include "..fullname" . }}-scripts
33 changes: 33 additions & 0 deletions dev-docs/howto/vpn/helm/templates/rbac.yaml
@@ -0,0 +1,33 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "..fullname" . }}
automountServiceAccountToken: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "..fullname" . }}
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get"]
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "patch"]
- apiGroups: ["cilium.io"]
resources: ["ciliumendpoints"]
verbs: ["get", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "..fullname" . }}
subjects:
- kind: ServiceAccount
name: {{ include "..fullname" . }}
namespace: {{ .Release.Namespace }}
roleRef:
kind: ClusterRole
name: {{ include "..fullname" . }}
apiGroup: rbac.authorization.k8s.io
2 changes: 2 additions & 0 deletions dev-docs/howto/vpn/helm/templates/strongswan-secret.tpl
@@ -1,6 +1,8 @@
{{- define "strongswan.swanctl-conf" }}
connections {
net-net {
if_id_in = 0xfe
if_id_out = 0xfe
remote_addrs = {{ .Values.ipsec.peer }}
local {
auth = psk