dev-docs: add option to deploy a full L3 vpn
Showing 13 changed files with 318 additions and 29 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
# Experimental Constellation VPN | ||
|
||
This variant of the Helm chart establishes full L3 connectivity between on-prem | ||
workloads and Constellation pods. | ||
|
||
> **WARNING**: The experimental version of this Helm chart is, well, | ||
> experimental. It messes with the node configuration and has the | ||
> potential to break all networking. It's only tested on GCP, and only with | ||
> pre-release versions of Constellation, and even there it caused problems. | ||
> Use at your own risk! | ||
## Installation | ||
|
||
1. Choose one of the Constellation worker nodes and label it as the VPN node. | ||
|
||
```sh | ||
node=$(kubectl get nodes -l node-role.kubernetes.io/control-plane!="" -o jsonpath='{.items[0].metadata.name}') | ||
kubectl label nodes "$node" constellation.edgeless.systems/node-role=vpn | ||
``` | ||
|
||
2. Create and populate the configuration. Make sure to switch on `experimental.l3.enable`! | ||
|
||
```sh | ||
helm inspect values . >config.yaml | ||
``` | ||
|
||
3. Install the Helm chart. | ||
|
||
```sh | ||
helm install vpn . -f config.yaml | ||
``` | ||
|
||
4. Follow the post-installation instructions displayed by the CLI. | ||
|
||
## Architecture | ||
|
||
In addition to the NAT-based resources, the frontend contains an init container | ||
that sets up a networking bypass around Cilium. This is necessary to circumvent | ||
the restrictions that Cilium applies to pod traffic (source IP enforcement, for | ||
example). VPN traffic is routed directly to the host network, which in turn is | ||
modified to forward VPN traffic correctly to other pods. | ||
|
||
An artificial `CiliumEndpoint` is created to make Cilium aware of the on-prem | ||
IP ranges and route traffic from other nodes to the VPN node. There's also a | ||
`DaemonSet` that configures appropriate routes in the host network namespace of | ||
all cluster nodes. | ||
|
||
## Cleanup | ||
|
||
There's no built-in lifecycle management of the host network resources created | ||
by this chart. To remove the VPN configuration, first uninstall the Helm chart | ||
and then reboot all the nodes to start with a clean network. There's a button | ||
for this in the *Instance Group* view of GCP. | ||
|
||
## Limitations | ||
|
||
* Service IPs need to be proxied by the VPN frontend pod. This is a single | ||
point of failure, and it may become a bottleneck. | ||
* The VPN is bound to a single node, which is another single point of failure. | ||
* Interaction between VPN and NetworkPolicy is not fully explored yet, and may | ||
have surprising consequences. |
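The installation steps label one worker node as the VPN node, and the limitations note that the VPN is bound to a single node. A minimal pre-flight sketch, assuming that exactly one node should carry the label; `count_vpn_nodes`, `check_single_vpn_node`, and the `KUBECTL` override are illustrative names, not part of the chart:

```sh
#!/bin/sh
# Hypothetical pre-flight helpers, not part of the chart.
# KUBECTL can be overridden, e.g. to test without a cluster.
KUBECTL=${KUBECTL:-kubectl}

# Print the number of nodes that carry the VPN node label.
count_vpn_nodes() {
  $KUBECTL get nodes -l constellation.edgeless.systems/node-role=vpn -o name |
    wc -l | tr -d ' '
}

# Succeed only if exactly one node is labeled (an assumption of this sketch).
check_single_vpn_node() {
  n=$(count_vpn_nodes)
  if [ "$n" -ne 1 ]; then
    echo "expected exactly 1 VPN node, found $n" >&2
    return 1
  fi
}
```

Running `check_single_vpn_node` before `helm install` would catch a missing or duplicated label early.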
15 changes: 15 additions & 0 deletions
dev-docs/howto/vpn/files/routing/experimental/all-nodes.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
#!/bin/sh | ||
|
||
cleanup() { | ||
for cidr in ${VPN_PEER_CIDRS}; do | ||
ip route delete "${cidr}" || true | ||
done | ||
# Exit after cleanup; otherwise the loop below resumes and re-adds the routes. | ||
exit 0 | ||
} | ||
trap cleanup INT TERM | ||
|
||
while true; do | ||
for cidr in ${VPN_PEER_CIDRS}; do | ||
ip route replace "${cidr}" dev cilium_wg0 | ||
done | ||
sleep 10 | ||
done |
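The script above re-applies the peer routes every 10 seconds and removes them on `INT` or `TERM`. A sketch of the same loop that reacts to signals promptly: a shell runs a trap only after the current foreground command finishes, so this version sleeps in the background and uses `wait`, which a trapped signal interrupts. The `IP` and `INTERVAL` overrides and the function names are assumptions made for illustration:

```sh
#!/bin/sh
# Illustrative variant of all-nodes.sh; IP and INTERVAL are hypothetical
# overrides (e.g. IP="echo ip" gives a dry run).
IP=${IP:-ip}
INTERVAL=${INTERVAL:-10}

# Point every peer CIDR at Cilium's WireGuard interface.
apply_routes() {
  for cidr in ${VPN_PEER_CIDRS:-}; do
    $IP route replace "${cidr}" dev cilium_wg0
  done
}

# Remove the peer routes again, ignoring routes that are already gone.
remove_routes() {
  for cidr in ${VPN_PEER_CIDRS:-}; do
    $IP route delete "${cidr}" || true
  done
}

# Reconcile until INT or TERM arrives, then clean up and exit.
run_loop() {
  trap 'remove_routes; exit 0' INT TERM
  while true; do
    apply_routes
    sleep "${INTERVAL}" &
    wait $!   # returns early when a trapped signal arrives
  done
}
```

Calling `run_loop` at the end of the script starts the reconciliation. The background `sleep` may outlive the shell by up to one interval, which is harmless here.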
71 changes: 71 additions & 0 deletions
dev-docs/howto/vpn/files/routing/experimental/frontend-host.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
#!/bin/sh | ||
|
||
set -eu | ||
|
||
# TODO: The bridge needs to be cleaned up if the pod migrates to another host! | ||
|
||
# Bypass the lxc* interface so that VPN packets are not subject to Cilium's | ||
# pod restrictions. | ||
|
||
ip link delete vpn_br || true | ||
ip link add vpn_br type bridge | ||
ip link set dev vpn_lower master vpn_br | ||
ip address add 169.254.42.1 dev vpn_br scope link | ||
ip link set dev vpn_br up | ||
ip link set dev vpn_lower up | ||
ip route replace 169.254.42.2 dev vpn_br | ||
|
||
# Traffic from local pods or other nodes to the VPN is fwmarked and sent to | ||
# the VPN frontend pod. | ||
|
||
ip route replace default via 169.254.42.2 dev vpn_br table 41 | ||
ip rule add fwmark 0x2/0x2 table 41 priority 41 || true | ||
|
||
iptables -t mangle -N VPN_PRE || iptables -t mangle -F VPN_PRE | ||
for cidr in ${VPN_PEER_CIDRS}; do | ||
iptables -t mangle -A VPN_PRE -i lxc+ -d "${cidr}" -j MARK --set-mark 2 | ||
iptables -t mangle -A VPN_PRE -i cilium_wg0 -d "${cidr}" -j MARK --set-mark 2 | ||
done | ||
|
||
iptables -t mangle -C PREROUTING -j VPN_PRE || iptables -t mangle -I PREROUTING -j VPN_PRE | ||
|
||
# Cilium does NAT if the destination IP appears to be outside the cluster, | ||
# which would affect VPN traffic, too. We register a rule that skips Cilium | ||
# NAT for traffic to the VPN. | ||
|
||
iptables -t nat -N VPN_POST || iptables -t nat -F VPN_POST | ||
for cidr in ${VPN_PEER_CIDRS}; do | ||
iptables -t nat -I VPN_POST -d "${cidr}" -j ACCEPT | ||
done | ||
iptables -t nat -C POSTROUTING -j VPN_POST || iptables -t nat -I POSTROUTING -j VPN_POST | ||
|
||
# Now we tell the host how to deal with traffic from the VPN to the pod | ||
# network. The rule that we're crafting below is: | ||
# Send everything from the VPN to the pod network over Cilium's WireGuard | ||
# tunnel, unless the traffic is for a local pod. | ||
# This turns out to be a bit tricky to model, because we need to split the | ||
# traffic between lxc+ and cilium_wg0. Cilium configures the host-local routes | ||
# in the host network namespace, and we would like to reuse these, so we can't | ||
# create a separate routing table for fw-marked VPN packets. | ||
# For some reason, policy-based routing on the reroute-check after FORWARD | ||
# does not work here. But we can turn the logic around and try to mark packets | ||
# that *don't* come from the VPN. | ||
|
||
# Tell the node to send all packets for pod IP ranges over cilium_wg0. | ||
# Local pods match the more specific routes. | ||
ip route replace "${VPN_POD_CIDR}" dev cilium_wg0 | ||
|
||
# Create a routing table that sends marked traffic via the default route (i.e. | ||
# the physical interface). | ||
# Word splitting is intended here. | ||
# shellcheck disable=SC2046 | ||
ip route replace $(ip route show default) table 44 | ||
ip rule add fwmark 0x4/0x4 table 44 priority 44 || true | ||
|
||
# Traffic that should go to the physical interface: locally created packets | ||
# with source IP outside the pod CIDR and non-local destination. | ||
|
||
iptables -t mangle -N VPN_OUTPUT || iptables -t mangle -F VPN_OUTPUT | ||
iptables -t mangle -A VPN_OUTPUT -o lxc+ -j RETURN | ||
iptables -t mangle -A VPN_OUTPUT ! -s "${VPN_POD_CIDR}" -d "${VPN_POD_CIDR}" -j MARK --set-mark 0x4/0x4 | ||
iptables -t mangle -C OUTPUT -j VPN_OUTPUT || iptables -t mangle -I OUTPUT -j VPN_OUTPUT |
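The host script uses two firewall mark bits: `0x2` selects table 41 (towards the VPN frontend pod) and `0x4` selects table 44 (the physical interface). The `ip rule` entries match with a mask, while the `VPN_PRE` rules call `--set-mark 2` without one. The sketch below only does the bit arithmetic, so the difference is easy to see; the helper names are illustrative. Whether overwriting other mark bits matters in practice depends on which marks Cilium sets, which this commit does not state.

```sh
#!/bin/sh
# Illustrative helpers for reasoning about packet marks; they do not touch
# the network.

# mark_after_set OLD VALUE [MASK]
# Mark that results from `-j MARK --set-mark VALUE[/MASK]` on a packet whose
# current mark is OLD. Without a mask, iptables assumes 0xffffffff.
mark_after_set() {
  old=$1
  value=$2
  mask=${3:-0xffffffff}
  printf '0x%x\n' $(( ($old & ~$mask) | $value ))
}

# rule_matches MARK VALUE MASK
# True if `ip rule ... fwmark VALUE/MASK` selects a packet marked MARK.
rule_matches() {
  [ $(( $1 & $3 )) -eq $(( $2 )) ]
}

mark_after_set 0x4 2        # prints 0x2: no mask, the 0x4 bit is overwritten
mark_after_set 0x4 0x2 0x2  # prints 0x6: masked, only the 0x2 bit changes
```

With the masked form, a packet can carry both bits at once, and `rule_matches 0x6 0x2 0x2` as well as `rule_matches 0x6 0x4 0x4` succeed.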
28 changes: 28 additions & 0 deletions
dev-docs/howto/vpn/files/routing/experimental/frontend-pod.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
#!/bin/sh | ||
|
||
set -eu | ||
|
||
if [ "$$" -eq "1" ]; then | ||
echo 'This script must run in the root PID namespace, but $$ == 1!' >&2 | ||
exit 1 | ||
fi | ||
|
||
ip netns attach root 1 | ||
|
||
ip link delete vpn_upper || true | ||
ip link delete vpn_lower || true | ||
ip netns exec root ip link delete vpn_lower || true | ||
|
||
ip link add vpn_upper type veth peer name vpn_lower | ||
ip link set dev vpn_lower netns root | ||
ip address add 169.254.42.2 dev vpn_upper scope link | ||
ip link set dev vpn_upper up | ||
|
||
echo "Meanwhile, in the root network namespace ..." >&2 | ||
|
||
ip netns exec root sh -x frontend-host.sh | ||
|
||
echo "Back in the pod network namespace ..." >&2 | ||
|
||
ip route replace 169.254.42.1 dev vpn_upper | ||
ip route replace default via 169.254.42.1 dev vpn_upper table 41 | ||
ip rule add to "${VPN_POD_CIDR}" iif vpn_wg0 table 41 priority 41 |
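The pod script starts with a guard against running as PID 1, because `ip netns attach root 1` only reaches the host network namespace when PID 1 is the host's init process. A sketch of that guard as a reusable function that writes its message to stderr and reports failure; the function name and the optional PID argument are illustrative. Note that a PID other than 1 is only a heuristic and does not prove the script runs in the root PID namespace.

```sh
#!/bin/sh
# Hypothetical helper: fail if the given PID (default: this shell) is 1,
# i.e. the script is the init process of its own PID namespace.
require_root_pid_ns() {
  pid=${1:-$$}
  if [ "$pid" -eq 1 ]; then
    echo 'This script must run in the root PID namespace, but $$ == 1!' >&2
    return 1
  fi
}
```

A script running under `set -eu` could call `require_root_pid_ns || exit 1` before touching any network state.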
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
#!/bin/sh | ||
|
||
set -eu | ||
|
||
iptables -t nat -N VPN_POST || iptables -t nat -F VPN_POST | ||
|
||
for cidr in ${VPN_PEER_CIDRS}; do | ||
iptables -t nat -A VPN_POST -s "${cidr}" -d "${VPN_POD_CIDR}" -j MASQUERADE | ||
done | ||
|
||
iptables -t nat -C POSTROUTING -j VPN_POST || iptables -t nat -A POSTROUTING -j VPN_POST |
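The scripts in this commit repeat two idempotency patterns: create a chain or flush it if it already exists (`-N ... || -F ...`), and append a jump only if it is not already present (`-C ... || -A ...`). A sketch that wraps both in helpers; the function names and the `IPTABLES` override are assumptions made for illustration:

```sh
#!/bin/sh
# Illustrative wrappers; IPTABLES can be overridden for a dry run or a test.
IPTABLES=${IPTABLES:-iptables}

# ensure_chain TABLE CHAIN
# Create CHAIN in TABLE, or flush it if it already exists.
ensure_chain() {
  $IPTABLES -t "$1" -N "$2" 2>/dev/null || $IPTABLES -t "$1" -F "$2"
}

# ensure_jump TABLE PARENT CHAIN
# Append a jump from PARENT to CHAIN unless the rule is already there.
ensure_jump() {
  $IPTABLES -t "$1" -C "$2" -j "$3" 2>/dev/null || $IPTABLES -t "$1" -A "$2" -j "$3"
}
```

The script above would then reduce to `ensure_chain nat VPN_POST`, the `MASQUERADE` loop, and `ensure_jump nat POSTROUTING VPN_POST`. The host script inserts its jumps with `-I` instead of `-A`, so it would need a variant of `ensure_jump`.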
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,14 @@ | ||
-{{- if .Values.ipsec.enabled }} | ||
 Required postinstallation steps (also see README.md): | ||
 | ||
-# Configure the LoadBalancer | ||
+# Patch the CiliumEndpoint | ||
+ | ||
+kubectl patch cep {{ include "..fullname" . }}-routes --type='json' \ | ||
+-p='[{"op": "replace", "path": "/status/networking/node", "value":"'$(kubectl get pods {{ include "..fullname" . }}-frontend-0 -o jsonpath={.status.hostIP})'"}]' | ||
+ | ||
+{{- if .Values.ipsec.enabled }} | ||
+# Patch the LoadBalancer | ||
 | ||
 1. Find the node hosting the VPN server: | ||
 kubectl get pods {{ include "..fullname" . }}-frontend-0 -o jsonpath={.spec.nodeName} | ||
-2. Edit the load balancer resource in GCP and remove all other endpoints. | ||
-{{- end }} | ||
+2. Edit the load balancer in the cloud and remove all other endpoints. | ||
+{{- end }} | ||
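The post-installation note patches the `CiliumEndpoint` with the host IP of the frontend pod, using nested quoting inside a single command. A sketch that builds the JSON patch in a separate function, which makes the quoting easier to check; the function name and the `CEP_NAME` and `POD_NAME` placeholders are hypothetical and stand in for the names the chart renders:

```sh
#!/bin/sh
# cep_node_patch HOST_IP
# Print a JSON patch that sets the CiliumEndpoint's node to HOST_IP.
cep_node_patch() {
  printf '[{"op": "replace", "path": "/status/networking/node", "value": "%s"}]\n' "$1"
}

# Usage sketch (requires cluster access, hence commented out):
#   host_ip=$(kubectl get pods "$POD_NAME" -o jsonpath='{.status.hostIP}')
#   kubectl patch cep "$CEP_NAME" --type=json -p "$(cep_node_patch "$host_ip")"
```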
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
{{ if .Values.experimental.l3.enable }} | ||
apiVersion: cilium.io/v2 | ||
kind: CiliumEndpoint | ||
metadata: | ||
  name: {{ include "..fullname" . }}-routes | ||
status: | ||
  encryption: {} | ||
  id: 0 | ||
  identity: | ||
    id: 0 | ||
  networking: | ||
    addressing: | ||
    {{- range .Values.peerCIDRs }} | ||
      - ipv4: {{ . }} | ||
    {{- end }} | ||
    node: "" | ||
  policy: | ||
    egress: | ||
      enforcing: false | ||
      state: disabled | ||
    ingress: | ||
      enforcing: false | ||
      state: disabled | ||
  state: ready | ||
  visibility-policy-status: disabled | ||
{{- end }} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
{{ if .Values.experimental.l3.enable }} | ||
apiVersion: apps/v1 | ||
kind: DaemonSet | ||
metadata: | ||
  name: {{ include "..fullname" . }}-routes | ||
  labels: {{- include "..labels" . | nindent 4 }} | ||
spec: | ||
  selector: | ||
    matchLabels: | ||
      {{- include "..selectorLabels" . | nindent 6 }} | ||
      component: routes | ||
  template: | ||
    metadata: | ||
      labels: | ||
        {{- include "..selectorLabels" . | nindent 8 }} | ||
        component: routes | ||
    spec: | ||
      hostNetwork: true | ||
      tolerations: | ||
      - key: "node-role.kubernetes.io/control-plane" | ||
        operator: "Exists" | ||
        effect: "NoSchedule" | ||
      containers: | ||
      - name: route | ||
        image: "nixery.dev/shell/iproute2" | ||
        securityContext: | ||
          capabilities: | ||
            add: ["NET_ADMIN"] | ||
        command: ["/bin/sh", "/entrypoint.sh"] | ||
        env: {{- include "..commonEnv" . | nindent 10 }} | ||
        volumeMounts: | ||
        - name: routes | ||
          mountPath: "/entrypoint.sh" | ||
          subPath: "all-nodes.sh" | ||
          readOnly: true | ||
      volumes: | ||
      - name: routes | ||
        configMap: | ||
          name: {{ include "..fullname" . }}-routes | ||
{{- end }} |