Add strict TLS mode support #2507

Merged
merged 7 commits into from
Jun 26, 2024

Contributor

@weyfonk weyfonk commented Jun 11, 2024

Refers to #2171

This adds a new Helm value named `agentTLSMode`, with two supported values:

  • `system-store`, the default: in this mode, the Fleet agent behaves as it has so far, trusting certificates signed by any CA installed in the system store when registering against an upstream cluster.
    • In this mode, Fleet also ignores a configured CA if the system trust store is sufficient.
  • `strict`: in this mode, the Fleet agent ignores the system store during the cluster registration process.

Updating that value in the `fleet-controller` config map triggers redeployment of the Fleet agent on the upstream cluster and on downstream clusters registered through a manager-initiated process (as is typically the case when importing clusters through Rancher). Redeployment is not triggered for clusters registered through an agent-initiated process.
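
To make the redeployment rule above concrete, here is a minimal Go sketch. It is not the code from this PR. It assumes the setting is stored under an `agentTLSMode` key in a JSON document read from the config map, and that an empty value falls back to the default. `controllerConfig`, `parseAgentTLSMode`, `needsRedeploy` and the `managerDeployed` flag are hypothetical names used for illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The two values supported by the agentTLSMode Helm value.
const (
	modeSystemStore = "system-store"
	modeStrict      = "strict"
)

// controllerConfig models only the field of interest.
// Assumption: the JSON key matches the Helm value name.
type controllerConfig struct {
	AgentTLSMode string `json:"agentTLSMode"`
}

// parseAgentTLSMode extracts and validates the TLS mode from raw JSON.
// An empty or missing value is treated as the default, system-store.
func parseAgentTLSMode(raw []byte) (string, error) {
	var cfg controllerConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", fmt.Errorf("decoding config: %w", err)
	}
	switch cfg.AgentTLSMode {
	case "":
		return modeSystemStore, nil
	case modeSystemStore, modeStrict:
		return cfg.AgentTLSMode, nil
	default:
		return "", fmt.Errorf("unsupported agentTLSMode %q", cfg.AgentTLSMode)
	}
}

// needsRedeploy reports whether an agent should be redeployed.
// managerDeployed is true for the upstream cluster and for downstream
// clusters registered through a manager-initiated process; agents
// registered through an agent-initiated process are left untouched.
func needsRedeploy(oldMode, newMode string, managerDeployed bool) bool {
	return managerDeployed && oldMode != newMode
}

func main() {
	oldMode := modeSystemStore
	newMode, err := parseAgentTLSMode([]byte(`{"agentTLSMode": "strict"}`))
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("manager-initiated:", needsRedeploy(oldMode, newMode, true))
	fmt.Println("agent-initiated:", needsRedeploy(oldMode, newMode, false))
}
```

Keeping the decision in a pure function like `needsRedeploy` makes the agent-initiated limitation explicit and easy to test in isolation.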

Open points:

  • Bypassing the system-wide CA store is done by setting the environment variables `SSL_CERT_FILE` and `SSL_CERT_DIR`, mentioned here, to `/dev/null`. This is admittedly a bit hacky. Is there a cleaner way, keeping in mind that Fleet agent containers run with a read-only filesystem?
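
For illustration, the strict-mode bypass described above could look roughly like the following Go sketch. Go's `crypto/x509` package documents `SSL_CERT_FILE` and `SSL_CERT_DIR` as overrides for the system certificate locations on Unix systems. The `EnvVar` type and the `tlsEnvVars` function are hypothetical stand-ins, not the actual Fleet code.

```go
package main

import "fmt"

// EnvVar is a hypothetical stand-in for a container environment variable
// (Kubernetes' corev1.EnvVar carries the same Name and Value fields).
type EnvVar struct {
	Name  string
	Value string
}

const (
	modeSystemStore = "system-store"
	modeStrict      = "strict"
)

// tlsEnvVars returns the extra environment variables to set on the agent
// container for the given TLS mode. In strict mode, both variables point
// at /dev/null so that no system CA file or directory can be loaded.
// Any other mode leaves the environment, and thus the system store, as is.
func tlsEnvVars(mode string) []EnvVar {
	if mode != modeStrict {
		return nil
	}
	return []EnvVar{
		{Name: "SSL_CERT_FILE", Value: "/dev/null"},
		{Name: "SSL_CERT_DIR", Value: "/dev/null"},
	}
}

func main() {
	for _, mode := range []string{modeSystemStore, modeStrict} {
		fmt.Printf("%s: %v\n", mode, tlsEnvVars(mode))
	}
}
```

Since this approach only sets environment variables, it needs no writable path in the container, which is compatible with the read-only filesystem constraint.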

@weyfonk weyfonk requested a review from a team as a code owner June 11, 2024 16:06
@weyfonk weyfonk force-pushed the 2171-strict-tls branch 4 times, most recently from 2a12174 to 2c6a86e June 13, 2024 07:24
@weyfonk weyfonk marked this pull request as draft June 13, 2024 08:51
@weyfonk weyfonk force-pushed the 2171-strict-tls branch 6 times, most recently from ce0e72e to bd6e18f June 14, 2024 16:50
@weyfonk weyfonk marked this pull request as ready for review June 14, 2024 16:50
@weyfonk weyfonk force-pushed the 2171-strict-tls branch 2 times, most recently from 0d5cf8e to 585fd48 June 17, 2024 07:27
p-se previously approved these changes Jun 25, 2024
weyfonk added 7 commits June 26, 2024 10:15
Fleet now supports two distinct TLS modes for its agent when registering
against an upstream cluster:
* `system-store`, the default, does not change its current behaviour:
  the Fleet agent trusts any certificate signed by a CA found in its
  system store
* `strict`, to bypass the system store when validating a certificate.

This adds agent TLS mode tests to the multi-cluster e2e workflow.

This uses the Helm CLI to install the Fleet agent on the downstream
cluster, instead of relying on an external script which is subject to
changes.

Logic used to delete previous Fleet agent installs now deletes the
`cattle-fleet-system` namespace as well, which leaves a clean slate for
further test cases.

Script `./dev/setup-fleet-downstream` is no longer needed by
multi-cluster end-to-end test cases for the agent's strict TLS mode.
This commit takes care of watching the agent TLS mode setting in the
`fleet-controller` config map, and of redeploying the Fleet agent to
upstream and downstream clusters when that setting changes.
Note that this only works for downstream clusters registered through a
manager-initiated process [1].

Testing this is done by reusing existing agent TLS mode test cases, and
triggering new deployments of the Fleet agent by patching the
`fleet-controller` config map.
Requirements for this include a cluster registered in manager-initiated
mode, while existing multi-cluster end-to-end tests need a downstream
cluster registered in agent-initiated mode.
Therefore, this commit also adds a new downstream cluster to the
multi-cluster CI workflow, which is so far only used for agent TLS mode
tests.

[1]: https://fleet.rancher.io/cluster-registration#manager-initiated

This implements cleanup, enabling agent TLS mode tests to be run
multiple times and not necessarily after all other tests.

This documents where environment variables used to bypass the store come
from, and sets them to `/dev/null` to make the absence of usable
values/cert files more explicit.
@weyfonk weyfonk merged commit de3f46a into rancher:main Jun 26, 2024
8 checks passed
weyfonk added a commit to weyfonk/fleet that referenced this pull request Jul 2, 2024
* Add agentTLSMode option

Fleet now supports two distinct TLS modes for its agent when registering
against an upstream cluster:
* `system-store`, the default, does not change its current behaviour:
  the Fleet agent trusts any certificate signed by a CA found in its
  system store. In this mode, Fleet will also ignore a configured CA,
  if the system trust store is sufficient.
* `strict`, to bypass the system store when validating a certificate.

* Redeploy Fleet agent when TLS mode setting changes

This commit takes care of watching the agent TLS mode setting in the
`fleet-controller` config map, and of redeploying the Fleet agent to
upstream and downstream clusters when that setting changes.
Note that this only works for downstream clusters registered through a
manager-initiated process [1].

Testing this is done by reusing existing agent TLS mode test cases, and
triggering new deployments of the Fleet agent by patching the
`fleet-controller` config map.
Requirements for this include a cluster registered in manager-initiated
mode, while existing multi-cluster end-to-end tests need a downstream
cluster registered in agent-initiated mode.
Therefore, this commit also adds a new downstream cluster to the
multi-cluster CI workflow, which is so far only used for agent TLS mode
tests.

[1]: https://fleet.rancher.io/cluster-registration#manager-initiated
weyfonk added a commit to weyfonk/fleet that referenced this pull request Jul 2, 2024
weyfonk added a commit to weyfonk/fleet that referenced this pull request Jul 3, 2024
weyfonk added a commit to weyfonk/fleet that referenced this pull request Jul 3, 2024
weyfonk added a commit to weyfonk/fleet that referenced this pull request Jul 3, 2024
weyfonk added a commit that referenced this pull request Jul 4, 2024
weyfonk added a commit that referenced this pull request Jul 4, 2024
weyfonk added a commit that referenced this pull request Jul 4, 2024
weyfonk added a commit to weyfonk/fleet that referenced this pull request Jul 5, 2024
thardeck added a commit that referenced this pull request Jul 25, 2024
thardeck added a commit that referenced this pull request Jul 25, 2024
[v0.8] Revert "Add strict TLS mode support (#2507)"