rfc: workload secrets #777

burgerdev · 2024-07-31T10:49:38Z

No description provided.

Freax13 · 2024-07-31T13:45:50Z

rfc/007-workload-secrets.md

+### Secret source
+
+Workload secrets would need to be either persisted or deterministically derived.
+The only secure persistence option for the Coordinator is encryption with a key derived from the secret seed.


We could also get a key from the hardware, though there are a lot of limitations:

- Currently, AFAIK, TDX doesn't have a way to derive a key, though there might be one in the future (the documentation mentions `TDG.MR.KEY.GET` but doesn't document it).
- AMD SEV-SNP has a way to derive a key, but ...
  - the key is tied to the host data.
  - the key is tied to a VCEK or VLEK.

Still, we could use this to locally cache the secret seed in a secure way: if either the VCEK or the VLEK is fixed (the VCEK should be static on bare-metal deployments, the VLEK should be static for supporting CSPs, e.g. AWS), we could ask the hardware to generate a key, use that key to encrypt the secret seed, and save the ciphertext on the node. When the coordinator gets restarted, it could ask the hardware to generate the same key again, use it to decrypt the secret seed, and do recovery on its own.
Obviously this would break on updating the deployment, but I'd say there's still value in allowing a coordinator to recover without a human present, especially if the coordinator downtime wasn't planned to begin with and the restart was spurious.

I like the idea of recovering the coordinator with hardware-sealed keys! Do you think such a mechanism should be used for workload secrets, too, or would it be sufficient to recover the Coordinator seed and derive the workload key?

I like it, but that should be part of another RFC. ;) Switching the source afterwards shouldn't impact the design much.

> Do you think such a mechanism should be used for workload secrets, too, or would it be sufficient to recover the Coordinator seed and derive the workload key?

I didn't think about storing workload secrets, that's an interesting idea! I'm not quite sure what the differences are in practice.

rfc/007-workload-secrets.md

m1ghtym0

Another great RFC, thanks!

m1ghtym0 · 2024-07-31T15:10:48Z

rfc/007-workload-secrets.md

+
+Due to being derived from the secret seed, the workload key is stable across manifest updates.
+This implies that no migration is needed, as long as the `WorkloadSecretID` doesn't change.
+On the other hand, this means that the workload secret can be obtained by the workload owner, and should be treated accordingly.


Definitely something we should keep in mind when documenting this.

thomasten · 2024-07-31T15:12:56Z

rfc/007-workload-secrets.md

+Instead of deriving the identity from workload characteristics, users are going to label workloads with an identity.
+To allow that, we're going to change the `Manifest` schema.
+The policies field of the manifest is currently a plain map of policy hashes to SANs.
+
+```json
+{
+    "Policies": {
+        "99dd77cbd7fe2c4e1f29511014c14054a21a376f7d58a48d50e9e036f4522f6b": [
+            "web",
+            "*",
+            "203.0.113.34"
+        ]
+    }
+}
+```
+
+Changing this to an object with named fields allows adding new metadata, for example a `WorkloadSecretID`.
+
+```json
+{
+    "Policies": {
+        "99dd77cbd7fe2c4e1f29511014c14054a21a376f7d58a48d50e9e036f4522f6b": {
+            "SAN": ["web", "*", "203.0.113.34"],
+            "WorkloadSecretID": "openbao-prod",
+            // ...
+        }
+    }
+}
+```
+
+### Implementation
+
+After successful workload verification, the Coordinator derives a key with HKDF, setting the info argument to `workload-key:$WorkloadSecretID`.
+This key is returned as a new field `WorkloadKey` in the `NewMeshCertResponse`.
+The initializer writes the key to a file in the shared tmpfs, from where it can be used by the workload.
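
The derivation step described in this hunk can be sketched in Python as follows. This is a minimal sketch, not the Coordinator's code: only the info string `workload-key:$WorkloadSecretID` and the new manifest layout come from the RFC text. The hash function (SHA-256), the empty salt, the 32-byte key length, and the helper names `hkdf_sha256` and `derive_workload_key` are assumptions made for illustration.

```python
import hashlib
import hmac
import json


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA-256: extract, then expand."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF")
    # Extract: an empty salt is treated as hash_len zero bytes.
    prk = hmac.new(salt or bytes(hash_len), ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and truncated.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_workload_key(secret_seed: bytes, manifest_json: str, policy_hash: str) -> bytes:
    """Look up the WorkloadSecretID for a verified policy and derive its key.

    Assumption: empty salt and a 32-byte output; the RFC text only fixes the info string.
    """
    policy = json.loads(manifest_json)["Policies"][policy_hash]
    info = ("workload-key:" + policy["WorkloadSecretID"]).encode()
    return hkdf_sha256(secret_seed, salt=b"", info=info, length=32)


# Usage with the manifest shape proposed in the RFC and a placeholder seed.
POLICY_HASH = "99dd77cbd7fe2c4e1f29511014c14054a21a376f7d58a48d50e9e036f4522f6b"
MANIFEST = json.dumps({
    "Policies": {
        POLICY_HASH: {
            "SAN": ["web", "*", "203.0.113.34"],
            "WorkloadSecretID": "openbao-prod",
        }
    }
})
workload_key = derive_workload_key(bytes(32), MANIFEST, POLICY_HASH)
```

Because the key depends only on the seed and the `WorkloadSecretID`, two manifests that map different policy hashes to the same ID yield the same key, which is the stability property the RFC relies on.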


This part is similar to MarbleRun. In addition, MarbleRun allows injecting the secret at different places and in different formats. With this, you can often avoid adapting your workload. If we wanted something similar for Contrast, I guess it should be built upon the primitive you introduce with this RFC, right? For example, one could create an init container that transforms the secret such that the actual workload wouldn't need to be modified. Do you think this works, or is there anything we'd need to consider for this RFC to enable this use case?

I guess it's a good sign that the proposals converge, even though I did not peek at the MarbleRun implementation :)

The scheme proposed here is intentionally designed to support derivation of more keys, and these use cases should be enabled by it.

The initializer (and other init containers) is somewhat limited in terms of what it can inject. Shared files are definitely possible, but env vars are not. I think the annotation pattern we use for the service mesh would be a good starting point for this.
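
As a rough illustration of the init-container idea discussed above, the following Python sketch reads a key file from a shared volume and re-exports it in another format. Everything specific here is hypothetical: the function name `export_key_as_base64`, the assumption that the initializer writes raw key bytes, and the choice of base64 as the target format are not taken from the RFC.

```python
import base64
import os
from pathlib import Path


def export_key_as_base64(src: Path, dst: Path) -> None:
    """Re-encode a key file so a workload can consume it without changes.

    Assumption: `src` holds the raw key bytes written by the initializer.
    """
    raw = src.read_bytes()
    # Create the destination with owner-only permissions before writing the secret.
    fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(base64.b64encode(raw) + b"\n")
```

An init container running after the initializer could call this with the path of the key in the shared tmpfs and a destination path the workload already expects. As noted above, this only covers shared files; it can't inject environment variables.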

katexochen

lgtm, thanks!

Implementation of 1) secret derivation and distribution and 2) workload identity in manifest should be possible in parallel.
Will the design of a guest-component for disk encryption be part of another RFC?

3u13r

LGTM

rfc/007-workload-secrets.md

burgerdev · 2024-08-05T12:35:23Z

> Will the design of a guest-component for disk encryption be part of another RFC?

I believe it should be in a separate RFC, because delivering a secret seed is useful on its own (see use case). By the way, the capabilities of CDH are evolving rapidly and might be worth taking a look at first.

burgerdev requested review from m1ghtym0, Freax13, 3u13r, katexochen, thomasten and msanft July 31, 2024 10:49

burgerdev added the `no changelog` label (PRs not listed in the release notes) Jul 31, 2024

Freax13 reviewed Jul 31, 2024

m1ghtym0 approved these changes Jul 31, 2024

thomasten reviewed Jul 31, 2024

katexochen approved these changes Aug 1, 2024

thomasten approved these changes Aug 1, 2024

3u13r approved these changes Aug 1, 2024

rfc: workload secrets

f840726

burgerdev force-pushed the rfc/007 branch from a831b20 to f840726 August 5, 2024 12:28

burgerdev merged commit 8b742d0 into main Aug 5, 2024
6 checks passed

burgerdev deleted the rfc/007 branch August 5, 2024 12:35
