Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace init secrets init container with helm #5557

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ddl-ebrown
Copy link
Contributor

@ddl-ebrown ddl-ebrown commented Jul 12, 2024

Tracking issue

Why are the changes needed?

Removes deployment complexity

A couple of other things to call out here:

  • It may be useful to deprecate / remove the code previously used to generate secrets if it's not otherwise needed (not sure what the standalone binary chart does for init)
  • The ClusterRole should be updated to remove the ability to write / delete secrets - as it stands, https://github.com/flyteorg/flyte/blob/master/charts/flyte-core/values.yaml#L71-L94 is extremely permissive and a bit of a security liability (why does flyteadmin need access to all secrets in the cluster?). Using Helm to init the secret allows for restricting the service account more appropriately

What changes were proposed in this pull request?

Remove core flyteadmin init-secret initContainer

  • A separate container instance that requires Kubernetes API access is
    unnecessary when Helm is able to generate the same secret values in
    the client before submitting resources to Kubernetes.

    The declarative approach here is identical to what the code does in
    https://github.com/flyteorg/flyte/blob/master/flyteadmin/auth/init_secrets.go#L80-L151

  • The lookup function is used so that on upgrades, the secret is not
    regenerated -- the existing values in the cluster are used.

  • Note that Helm always creates secret resources before deployment
    resources, so the secret values are guaranteed to be available before
    flyteadmin starts

Formalize oidc_client_secret in flyte-admin-secrets

  • When setting up integrations with an IdP like Keycloak, the
    oidc_client_secret must also be set in flyte-admin-secrets.

    Formalize that and make it more discoverable

How was this patch tested?

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Copy link

codecov bot commented Jul 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 60.97%. Comparing base (81afb76) to head (b3e424e).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5557      +/-   ##
==========================================
- Coverage   60.98%   60.97%   -0.02%     
==========================================
  Files         796      796              
  Lines       51647    51647              
==========================================
- Hits        31498    31492       -6     
- Misses      17249    17255       +6     
  Partials     2900     2900              
Flag Coverage Δ
unittests-datacatalog 69.31% <ø> (ø)
unittests-flyteadmin 58.69% <ø> (-0.05%) ⬇️
unittests-flytecopilot 17.79% <ø> (ø)
unittests-flytectl 67.44% <ø> (ø)
unittests-flyteidl 79.06% <ø> (ø)
unittests-flyteplugins 61.85% <ø> (ø)
unittests-flytepropeller 57.42% <ø> (ø)
unittests-flytestdlib 65.68% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ddl-ebrown ddl-ebrown force-pushed the replace-init-secrets-initContainer-with-helm branch 2 times, most recently from 873cbab to fc035cc Compare July 12, 2024 05:20
@@ -56,7 +56,8 @@ ${GOPATH:-~/go}/bin/helm-docs -c ${DIR}/../charts/

# This section is used by GitHub workflow to ensure that the generation step was run
if [ -n "$DELTA_CHECK" ]; then
DIRTY=$(git status --porcelain)
# find only deleted or removed lines, not replaced values
DIRTY=$(git diff --word-diff | grep '^[{\[]')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI check needs to be updated to understand that some values may change when regenerating Helm charts

An alternative approach would be to add values.yaml entries and plumb them through -- but that feels a bit overkill

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @eapolinario please take a look

@ddl-ebrown ddl-ebrown force-pushed the replace-init-secrets-initContainer-with-helm branch from fc035cc to 4a95b31 Compare July 12, 2024 05:46
 - A separate container instance that requires Kubernetes API access is
   unnecessary when Helm is able to generate the same secret values in
   the client before submitting resources to Kubernetes.

   The declarative approach here is identical to what the code does in
   https://github.com/flyteorg/flyte/blob/master/flyteadmin/auth/init_secrets.go#L80-L151

 - The lookup function is used so that on upgrades, the secret is not
   regenerated -- the existing values in the cluster are used.

 - Note that Helm always creates secret resources before deployment
   resources, so the secret values are guaranteed to be available before
   flyteadmin starts

 - Update helm chart regenerate check so that it only fails if the diff
   has new or removed lines

Signed-off-by: ddl-ebrown <[email protected]>
 - When setting up integrations with an IdP like Keycloak, the
   oidc_client_secret must also be set in flyte-admin-secrets.

   Formalize that and make it more discoverable

Signed-off-by: ddl-ebrown <[email protected]>
@ddl-ebrown ddl-ebrown force-pushed the replace-init-secrets-initContainer-with-helm branch from 4a95b31 to b3e424e Compare July 12, 2024 20:04
Copy link
Contributor

@davidmirror-ops davidmirror-ops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also reproduced this and works as expected. Thank you!

@ddl-ebrown do you think you could PR the cleanup tasks you mention in the description? I think it makes sense:

It may be useful to deprecate / remove the code previously used to generate secrets if it's not otherwise needed (not sure what the standalone binary chart does for init)
The ClusterRole should be updated to remove the ability to write / delete secrets - as it stands, https://github.com/flyteorg/flyte/blob/master/charts/flyte-core/values.yaml#L71-L94 is extremely permissive and a bit of a security liability (why does flyteadmin need access to all secrets in the cluster?). Using Helm to init the secret allows for restricting the service account more appropriately

@ddl-ebrown
Copy link
Contributor Author

Also reproduced this and works as expected. Thank you!

@ddl-ebrown do you think you could PR the cleanup tasks you mention in the description? I think it makes sense:

It may be useful to deprecate / remove the code previously used to generate secrets if it's not otherwise needed (not sure what the standalone binary chart does for init)
The ClusterRole should be updated to remove the ability to write / delete secrets - as it stands, master/charts/flyte-core/values.yaml#L71-L94 is extremely permissive and a bit of a security liability (why does flyteadmin need access to all secrets in the cluster?). Using Helm to init the secret allows for restricting the service account more appropriately

Happy to remove any dead code used to generate these values -- that should be easy.

I'd like someone on your side to confirm the ClusterRole changes though, and I think it might be best to do a separate pass on cleaning up the default ClusterRole config in another PR since the issue is a little orthogonal (my hunch is that a bunch of those perms are unnecessary for flyteadmin). What do you think?

@davidmirror-ops
Copy link
Contributor

Also reproduced this and works as expected. Thank you!
@ddl-ebrown do you think you could PR the cleanup tasks you mention in the description? I think it makes sense:

It may be useful to deprecate / remove the code previously used to generate secrets if it's not otherwise needed (not sure what the standalone binary chart does for init)
The ClusterRole should be updated to remove the ability to write / delete secrets - as it stands, master/charts/flyte-core/values.yaml#L71-L94 is extremely permissive and a bit of a security liability (why does flyteadmin need access to all secrets in the cluster?). Using Helm to init the secret allows for restricting the service account more appropriately

Happy to remove any dead code used to generate these values -- that should be easy.

I'd like someone on your side to confirm the ClusterRole changes though, and I think it might be best to do a separate pass on cleaning up the default ClusterRole config in another PR since the issue is a little orthogonal (my hunch is that a bunch of those perms are unnecessary for flyteadmin). What do you think?

Yep, the changes to the ClusterRole should be a separate PR, as the removal of the unused code.

@ddl-ebrown
Copy link
Contributor Author

Also reproduced this and works as expected. Thank you!
@ddl-ebrown do you think you could PR the cleanup tasks you mention in the description? I think it makes sense:

It may be useful to deprecate / remove the code previously used to generate secrets if it's not otherwise needed (not sure what the standalone binary chart does for init)
The ClusterRole should be updated to remove the ability to write / delete secrets - as it stands, master/charts/flyte-core/values.yaml#L71-L94 is extremely permissive and a bit of a security liability (why does flyteadmin need access to all secrets in the cluster?). Using Helm to init the secret allows for restricting the service account more appropriately

Happy to remove any dead code used to generate these values -- that should be easy.
I'd like someone on your side to confirm the ClusterRole changes though, and I think it might be best to do a separate pass on cleaning up the default ClusterRole config in another PR since the issue is a little orthogonal (my hunch is that a bunch of those perms are unnecessary for flyteadmin). What do you think?

Yep, the changes to the ClusterRole should be a separate PR, as the removal of the unused code.

Ok, so then to get this one merged we just need to get the CI check green

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Approved yet unmerged PRs
Development

Successfully merging this pull request may close these issues.

2 participants