Private Keys for Agent ↔ Tenant association during registration #4

jhunt · 2019-09-07T13:02:31Z

@thomasmitchell raised this concern in a discussion we had offline; moving it here so that we can discuss.

Prospective agents should not identify the tenant they wish to be owned by through its tenant ID, because the tenant ID has been treated as a fairly public token. Instead, a new private agent token should be provisioned to identify the tenant.

Consider a system where keys are provisioned ahead of time by the SHIELD core for authentication, per agent, by a tenant. This is simpler than a whitelist system, provides a source of unique identification regardless of naming, and allows for easy transfer of tenant ownership. The agent then does not need to generate its own key.

Downsides are:

the core must be deployed before all agents

runtime config agent deployments become impossible

randomly provisioned kubernetes agents become difficult (but they already are cumbersome through the proposed system).
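To make the pre-provisioning idea concrete, here is a minimal sketch of the bookkeeping it implies on the core side. The class name, method names, and the use of an opaque random token are assumptions for illustration only; nothing here is SHIELD's actual data model.

```python
import secrets


class ProvisionedKeys:
    """Hypothetical core-side table mapping private agent tokens to tenants."""

    def __init__(self):
        self._tenant_by_token = {}

    def provision(self, tenant_uuid):
        # The core generates the secret, so the agent never picks its own.
        token = secrets.token_urlsafe(32)
        self._tenant_by_token[token] = tenant_uuid
        return token

    def tenant_for(self, token):
        # Unknown tokens resolve to None rather than raising.
        return self._tenant_by_token.get(token)

    def transfer(self, token, new_tenant_uuid):
        # Re-pointing the token moves the agent; the agent's own config is untouched.
        if token not in self._tenant_by_token:
            raise KeyError("unknown agent token")
        self._tenant_by_token[token] = new_tenant_uuid


keys = ProvisionedKeys()
token = keys.provision("tenant-a")   # must happen before the agent is deployed
keys.transfer(token, "tenant-b")     # ownership changes without touching the agent
```

The first downside listed above is visible in the last two lines: `provision` has to run on a deployed core before any agent can be handed its token.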

jhunt · 2019-09-07T14:17:16Z

Here's one way this approach might work (WF1):

Deploy SHIELD
Create a Tenant; as that Tenant:
Fill out the "Provision New Agent" form (SHIELD will generate a keypair)
Download the private key for the Agent
Deploy the Agent with the private key

This is the most convenient workflow, but it does mean that the SHIELD Core is in charge of the key material for some (albeit small) window of time.

A more rigorously secure workflow might look like this (WF2):

Deploy SHIELD
Generate an RSA keypair offline
Create a Tenant; as that Tenant:
Fill out the "Provision New Agent" form (providing the public key)
Deploy the Agent with the private key

(these workflows differ only in who generates the keypair, and when).

For comparison, here is the workflow we are currently proposing (WFP):

Deploy SHIELD
Create a Tenant (to get its UUID)
Generate an RSA keypair offline
Deploy the Agent with the private key
Approve the agent registration in the SHIELD Core (for the Tenant)

aside: I don't personally think WFP qualifies as "cumbersome"

These three workflows all look fairly equal in terms of steps, involved systems, and level of complexity, so the base case (everything is manual) is identical. Let's consider the more automated deployment scenarios of BOSH and Kubernetes.

BOSH

Here is what the proposed workflow, WFP, looks like when deploying under BOSH:

Deploy SHIELD
Create a Tenant (to get its UUID)
Deploy Agent (and data system, presumably) via BOSH, using -v tenant_uuid=$x and leveraging CredHub to create the RSA keypair
Approve the agent registration in the SHIELD Core (for the Tenant)

Notably, the handling of key material is 100% taken care of by automated systems, and the operator can remain unaware of the existence of the key. If they want to, they can retrieve the public key from CredHub, post-deployment, for visual verification in the approval screens of the SHIELD Web UI (or CLI).
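For the visual verification step, comparing fingerprints is easier than comparing whole keys. The sketch below computes the SHA256 fingerprint in the style OpenSSH prints, assuming the key is available as a one-line OpenSSH public key (what ssh-keygen writes to id_rsa.pub). How SHIELD renders keys on its approval screens is not specified here, and a PEM-encoded public key would need converting to that format first; the function names are illustrative.

```python
import base64
import hashlib


def openssh_sha256_fingerprint(public_key_line):
    """Return the SHA256 fingerprint of a one-line OpenSSH public key."""
    # Line format: "<type> <base64 blob> [comment]"; only the decoded blob is hashed.
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64-blob> [comment]'")
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 with the '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def same_key(deployed_line, pending_line):
    """True when both lines carry the same key material, ignoring comments."""
    return (openssh_sha256_fingerprint(deployed_line)
            == openssh_sha256_fingerprint(pending_line))
```

An operator would pass the public key retrieved after deployment as `deployed_line` and the key shown for the pending registration as `pending_line`.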

Here is what WF2 looks like under BOSH (WF2-B1):

Deploy SHIELD
Generate an RSA keypair offline
Create a Tenant; as that Tenant:
Fill out the "Provision New Agent" form (providing the public key)
Deploy Agent (and data system, presumably) via BOSH, using -v agent_key=$y

This has the disadvantage that it does not work under BOSH's runtime-config, so it is impossible to colocate the agent transparently based on deployment composition. Ideally, I should be able to do this:

# runtime-config.yml
variables:
  - name: shield-agent-key
    type: rsa

addons:
  - name: shield-agent
    include:
      - jobs: [{ release: postgres, name: postgres }]
    jobs:
      - name: shield-agent
        release: shield
        properties:
          key: ((shield-agent-key.private_key))

And every deployment would get a unique key. To make this possible, we have to amend the application of WF2 on BOSH, resulting in (WF2-B2):

Deploy SHIELD
Deploy Agent (and data system, presumably) via BOSH, leveraging CredHub (via the runtime-config) to get the key
Retrieve the RSA public key from CredHub
Log back into SHIELD and fill out the "Provision New Agent" form (providing the public key)

The added bounce through CredHub involves a new system (with a CLI we have heretofore not needed). This is more complicated, but not untenable.

Kubernetes

Let's turn to Kubernetes. A bespoke "vanilla" deployment (i.e. not using any CRDs, Operators, or ex post facto configuration) looks like this for the proposed workflow (WFP-K1):

Deploy SHIELD
Create a Tenant (to get its UUID)
Generate an RSA keypair offline (ssh-keygen -t rsa -f id_rsa) and store it as a Secret (kubectl create secret generic my-secret --from-file=ssh-privatekey=$PWD/id_rsa)
Deploy the Agent with the private key (via kubectl apply -f ...)
Approve the agent registration in the SHIELD Core (for the Tenant)

Applying WF2 to Kubernetes looks much like the non-runtime-config BOSH story (WF2-K1):

Deploy SHIELD
Generate an RSA keypair offline (ssh-keygen -t rsa -f id_rsa) and store it as a Secret (kubectl create secret generic my-secret --from-file=ssh-privatekey=$PWD/id_rsa)
Create a Tenant; as that Tenant:
Fill out the "Provision New Agent" form (providing the public key)
Deploy the Agent with the private key (via kubectl apply -f ...)

A (Secure) Compromise

I believe the crux of the disagreement between people who prefer WF2 and people who prefer WFP boils down to agent autonomy, which is a policy decision best left to operators (we should concern ourselves with Mechanism, not Policy).

The WFP workflow gives greater power/convenience to people deploying agents, whereas the WF2 workflow gives greater control to the people operating tenants.

In the spirit of mechanism, not policy, what if we do this:

Continue to identify agents by identity and private key (i.e. we authorize keys for subsets of identity)
Enable SHIELD site administrators to enable or disable Agent-originated registration.

I think this allows us to support both workflows.

If you are convenience-minded: enable automatic registration and let your deployment tooling generate unique keys and identities (given a tenant UUID to start from). When it is time to approve keys that show up in the web interface, you can either blindly approve them (super-convenience-minded) or verify the public key against what you think you deployed (security+convenience, or the trust but verify model).

If you are security-minded: disable automatic registration and manually provision all of your keys ahead of time.
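A minimal sketch of how the site-wide toggle could separate the two workflows. The class, field names, and the three outcome strings are hypothetical, and the choice to reject a known identity that presents a different key is an assumption of this sketch, not something settled in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical model of the core's registration policy."""
    allow_agent_registration: bool
    # (tenant_uuid, agent_identity) -> authorized public key
    authorized: dict = field(default_factory=dict)
    # same shape; keys waiting for an operator's approval
    pending: dict = field(default_factory=dict)

    def provision(self, tenant, identity, public_key):
        # WF2 style: the tenant operator registers the key ahead of time.
        self.authorized[(tenant, identity)] = public_key

    def register(self, tenant, identity, public_key):
        # Called when an agent announces itself to the core.
        if self.authorized.get((tenant, identity)) == public_key:
            return "authorized"
        if (tenant, identity) in self.authorized:
            # Assumption: a known identity with a different key is refused.
            return "rejected"
        if not self.allow_agent_registration:
            return "rejected"
        # WFP style: park the key until an operator approves it.
        self.pending[(tenant, identity)] = public_key
        return "pending"

    def approve(self, tenant, identity):
        self.authorized[(tenant, identity)] = self.pending.pop((tenant, identity))


strict = Core(allow_agent_registration=False)
strict.provision("tenant-1", "pg-0", "KEY-A")

relaxed = Core(allow_agent_registration=True)
```

With `strict`, only pre-provisioned keys ever get in; with `relaxed`, an unknown agent lands in `pending` and waits for `approve`. The mechanism is the same in both cases and only the flag differs.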

Some Historical Context

The two real-world systems I've been basing this analysis on (and indeed most of the design of the agent registration protocol) are Puppet auto-enrollment and SSH Host Keys.

Section 4.1 of RFC-4251 deals with the (now common) practice of trust-on-first-use:

The protocol provides the option that the server name - host key
association is not checked when connecting to the host for the first
time. This allows communication without prior communication of host
keys or certification. The connection still provides protection
against passive listening; however, it becomes vulnerable to active
man-in-the-middle attacks. Implementations SHOULD NOT normally allow
such connections by default, as they pose a potential security
problem. However, as there is no widely deployed key infrastructure
available on the Internet at the time of this writing, this option
makes the protocol much more usable during the transition time until
such an infrastructure emerges, while still providing a much higher
level of security than that offered by older solutions (e.g., telnet
[RFC0854] and rlogin [RFC1282]).

jhunt · 2019-10-04T15:03:53Z

Hearing no concerns, complaints, or rebuttals, we will adopt the following:

Continue to identify agents by identity and private key (i.e. we authorize keys for subsets of identity)
Enable SHIELD site administrators to enable or disable Agent-originated registration.

This allows us to support both workflows.

If you are convenience-minded: enable automatic registration and let your deployment tooling generate unique keys and identities (given a tenant UUID to start from). When it is time to approve keys that show up in the web interface, you can either blindly approve them (super-convenience-minded) or verify the public key against what you think you deployed (security+convenience, or the trust but verify model).

If you are security-minded: disable automatic registration and manually provision all of your keys ahead of time.

jhunt added ssh-active-fabric and removed ssh-active-fabric labels Sep 7, 2019

jhunt added this to the SSH Active Fabric milestone Sep 7, 2019

jhunt mentioned this issue Sep 7, 2019

Agent Uniqueness Concerns #1

Closed

jhunt added the resolved label Oct 4, 2019
