rfc: route service mesh egress over l7 #143
@@ -99,8 +99,8 @@ is not secured.
 * Which traffic do we want to secure? HTTP/S, TCP, UDP, ICMP? Is TLS even the
 correct layer for this?

-Since TCP service meshes are ubiquitously used, only supporting TCP for now is
-fine.
+Since HTTP service meshes are ubiquitously used, only supporting HTTP for now is
+fine. Note that supporting HTTP also supports gRPC since it uses HTTP/2.

 * Do we allow workloads to talk to the internet by default? Otherwise we can
 wrap all egress traffic in mTLS.

@@ -122,17 +122,10 @@ setup and configure Envoy.

 ### Step 1: Egress

-The routing works on layer 3. The workload owner configures the workload's
-service endpoints to point to a unique local IP out of the 127.0.0.1/8 CIDR.
-The workload owner configures the proxy to listen on each of those addresses and
-map it to a remote service domain.
-
-If possible, we don't want to touch the port of the packets so that we can
-transparently proxy all ports of a service.
-
-Note that this is not secure by default. If the user doesn't configure the
-endpoints in their application, traffic is sent out unencrypted and without
-authentication.
+The egress proxying works on Layer 7. All of the workload's TCP traffic is
Please describe what the benefit of going to L7 is. I believe the motivation is touched on in the PR description, but I don't quite understand why we'd want to choose this tradeoff.
+redirected via tproxy iptables rules to Envoy. By default, all traffic is
+wrapped inside TLS. The user can provide an allowlist for endpoints to just
+transparently forward.
Comment on lines +126 to +128

Is the allowlist also restricted to HTTP? If not, how do exemptions work?

Yes, I think in the first step it might not even be implemented at all. Then we can have only HTTP endpoints, but later we can also have arbitrary entries in the allowlist (domain names, IPs, HTTP endpoints). For this, we would prefix the allowlist entry with the "layer" like
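
To make the default-versus-allowlist behaviour discussed in this thread concrete, here is a minimal sketch of the per-connection decision. It is not taken from the RFC: the `Action` enum, the `egress_action` function, and the plain `host:port` entry format are all hypothetical. The "layer" prefix mentioned in the reply is deliberately left out because its syntax is not given here.

```python
from enum import Enum


class Action(Enum):
    WRAP_IN_TLS = "wrap_in_tls"  # default: traffic is wrapped inside TLS
    FORWARD = "forward"          # allowlisted: forwarded transparently


def egress_action(destination: str, allowlist: set[str]) -> Action:
    """Decide how to handle one outgoing connection.

    `destination` and the allowlist entries are hypothetical "host:port"
    strings; the RFC does not define the entry format.
    """
    # Host names are case-insensitive, so compare in lowercase.
    if destination.lower() in {entry.lower() for entry in allowlist}:
        return Action.FORWARD
    return Action.WRAP_IN_TLS


allowlist = {"example.com:443"}
print(egress_action("example.com:443", allowlist))
print(egress_action("backend.internal:8080", allowlist))
```

The point of the sketch is only that anything not explicitly listed falls through to the TLS-wrapping default, which is what makes the egress path secure by default.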

 <img src="./assets/001-egress.svg">

@@ -148,31 +141,13 @@ Envoy. Also traffic originating from the uid the proxy is started with, is not
 redirected. Since by default all traffic is routed to Envoy, the workload's
 ingress endpoints are secure by default.

-<img src="./assets/001-ingress.svg">
-
-### Step 3: Secure by default egress
-
-Ideally, we also want to have secure by default egress. But this comes with
-additional tradeoffs. If we assume that the workload does _NOT_ talk to any
-other endpoints outside of the service mesh, then we can redirect all traffic
-through the proxy. Since we cannot assume this to be true for all workloads,
-we still need the explicit configuration method described above.
-
-Since we need to allow DNS for Kubernetes service lookups, we can only redirect
-all TCP traffic via the proxy.
Comment on lines -153 to -162

Isn't the proposal here step 3, but only for HTTP? Why not switch to step 3 right away then, and implement step 2 later?

Good point. I always thought that this step would be implemented by hijacking DNS, but this is not mentioned in the step, so this fits exactly.
-
-### Optional: Egress capturing via DNS
+[1] <https://github.com/istio/istio/wiki/Understanding-IPTables-snapshot>

-If we want to allow additional endpoints, we also need to touch the pod's
-DNS resolution. An easy way would be to resolve the allowlisted entries
-either directly to the correct endpoint or to a special IP of the proxy.
-This requires the application to use basic DNS (over UDP) and not
-DNS-over-HTTPS, DNS-over-QUIC, or similar.
+<img src="./assets/001-ingress.svg">

 ### Outlook

-Especially for ingress but also for egress as described in step 3,
-we must ensure that the sidecar/init container runs
+We must ensure that the sidecar/init container runs
 before the workload receives traffic. Otherwise, it might be that the iptables
 rules are not configured yet and the traffic is sent without TLS and without
 client verification.
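
One way to guard against the ordering problem in the Outlook paragraph would be a startup check that refuses to report readiness until a TPROXY rule is present. The following is only a sketch under assumptions: the function name `has_tproxy_rule`, the port `15001`, and the sample rules are hypothetical, and the RFC does not show the actual rules. The input is assumed to be rule-listing text in the style printed by `iptables -t mangle -S`.

```python
def has_tproxy_rule(rules: str, proxy_port: int) -> bool:
    """Return True if `rules` contains a TPROXY rule targeting `proxy_port`.

    `rules` is rule-listing text (one rule per line); `proxy_port` is a
    hypothetical Envoy listener port.
    """
    for line in rules.splitlines():
        tokens = line.split()
        pairs = list(zip(tokens, tokens[1:]))
        # The rule must jump to the TPROXY target ...
        jumps_to_tproxy = ("-j", "TPROXY") in pairs
        # ... and redirect to the expected proxy port.
        on_port = ("--on-port", str(proxy_port)) in pairs
        if jumps_to_tproxy and on_port:
            return True
    return False


# Hypothetical sample input, not the rules from the RFC.
sample = """\
-P PREROUTING ACCEPT
-A PREROUTING -p tcp -j TPROXY --on-port 15001 --on-ip 127.0.0.1 --tproxy-mark 0x1/0x1
"""
print(has_tproxy_rule(sample, 15001))
print(has_tproxy_rule("-P PREROUTING ACCEPT\n", 15001))
```

A check like this only detects that a rule exists; it does not prove that the workload started after the rule was installed, so container ordering still has to be enforced separately.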

Why can we just swap out protocols here and the sentence remains true? I think there are enough TCP-based protocols that are neither HTTP nor encrypted, which all the other meshes support.

Coming from a Go/K8s world I think this sentence remains true since gRPC is the default way any Go application does things in K8s. I'm not as sure about other languages though.
Can you give examples of protocols typically used in K8s?

I think people choose non-HTTP protocols over TCP when they want to avoid parsing, for example in latency-sensitive or memory-restricted applications. Off the top of my head:
While gRPC may often be a good fit for this type (e.g. Arrow), I guess it's just too new to have significant market share.
Furthermore, I'd say that the lift-and-shift promise is most interesting to users with either proprietary or ancient workloads that are hard to secure - DICOM was mentioned in the past.

Ok, that is a very good point. Then I'll close this PR and go with the steps in the original PR.
I also thought about maybe using the L7 routing proposed here for Step 3, but I think the DNS way is better since it also solves the use cases you mentioned.
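
For reference, the "DNS way" from the "Optional: Egress capturing via DNS" section could look roughly like the lookup below, which answers configured names with a special IP of the proxy and lets everything else fall through to normal resolution. This is a sketch under assumptions, not the design from the original PR: `PROXY_IP`, `resolve_override`, and the `captured` set are hypothetical names, and the address `127.0.0.2` is a placeholder.

```python
from typing import Optional

PROXY_IP = "127.0.0.2"  # hypothetical "special IP of the proxy"


def resolve_override(name: str, captured: set[str]) -> Optional[str]:
    """Return the proxy IP for captured names, else None.

    None means "no override": the query should be answered by the normal
    (cluster) DNS resolution. Entries in `captured` are assumed to be
    lowercase names without a trailing dot.
    """
    normalized = name.rstrip(".").lower()
    if normalized in captured:
        return PROXY_IP
    return None


captured = {"service.example.com"}
print(resolve_override("Service.Example.com.", captured))
print(resolve_override("other.example.com", captured))
```

As the removed section notes, this only works when the application uses basic DNS over UDP; a workload that resolves names via DNS-over-HTTPS or DNS-over-QUIC would bypass the override.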