Fix flyte-core helm charts for multi cluster configuration #3993

Merged: 5 commits merged into flyteorg:master on Sep 27, 2023

Conversation

gdabisias
Contributor

@gdabisias gdabisias commented Aug 28, 2023

While testing the flyte-core deployment, I found an issue with the secrets access described in the Flyte multicluster guide.
The guide has us manually create a secret called cluster-secrets, but it is not mounted properly by the services that are supposed to read from it. This results in a panic in the syncresources service and in the sync-cluster-resources init container of the flyte admin service:

panic: Failed to get auth token: open /var/run/credentials/cluster_1_token: no such file or directory
failed to read k8s bearer token from configured path
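
For context, a minimal sketch of the kind of manually created secret the description refers to. The key `cluster_1_token` is taken from the file name in the panic message, and the secret name `cluster-secrets` from the description above; the `flyte` namespace, the `cluster_1_cacert` key, and the placeholder values are assumptions for illustration only.

```yaml
# Sketch only: a hand-created Secret whose keys become files when mounted.
apiVersion: v1
kind: Secret
metadata:
  name: cluster-secrets        # name used in the PR description
  namespace: flyte             # assumed namespace
type: Opaque
stringData:
  # Each key becomes a file name under the mount path, e.g.
  # /var/run/credentials/cluster_1_token
  cluster_1_token: "<data-plane-service-account-token>"   # placeholder
  cluster_1_cacert: "<data-plane-ca-certificate-pem>"     # hypothetical key, placeholder value
```

If this secret is not mounted at `/var/run/credentials` in a given container, the token file does not exist there, which matches the "no such file or directory" error.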

I described the pods and noticed that the secrets are not mounted correctly. There is also a special case in the chart for when a multi-cluster configuration is present:

{{- if gt (len .Values.configmap.clusters.labelClusterMap) 0 }}
...

but that section is loading the flyte-admin-secrets secret instead of the new cluster-secret.
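
To illustrate when that conditional becomes true, here is a hedged sketch of a values override with a non-empty `labelClusterMap`. Only `configmap.clusters.labelClusterMap` and the token path come from this PR; the label `team1`, the cluster id `cluster_1`, the endpoint, and the `clusterConfigs` field names are assumptions recalled from the multicluster guide and should be checked against the chart's own values files.

```yaml
# Sketch only: values that would make
# `gt (len .Values.configmap.clusters.labelClusterMap) 0` evaluate to true.
configmap:
  clusters:
    labelClusterMap:
      team1:                      # hypothetical label
        - id: cluster_1           # hypothetical cluster id
          weight: 1
    clusterConfigs:               # assumed field names, verify against the chart
      - name: cluster_1
        endpoint: "https://<data-plane-api-server>:443"   # placeholder
        enabled: true
        auth:
          type: file_path
          tokenPath: /var/run/credentials/cluster_1_token
          certPath: /var/run/credentials/cluster_1_cacert
```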

This fix makes things work with the current setup. Longer term, this custom secret should go away: the charts should ship a pre-created secret that users fill in from the values-eks.yaml file or something similar (e.g. by reusing flyte-admin-secrets).

Tracking issue

This PR is related to #3970 but does not fix it.

Check all the applicable boxes

  • I updated the documentation accordingly. -> No documentation to update
  • All new and existing tests passed -> Tested this in a real world multicluster setup
  • All commits are signed-off.

@welcome

welcome bot commented Aug 28, 2023

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

Signed-off-by: gdabisias <[email protected]>
Contributor

@wild-endeavor wild-endeavor left a comment

thank you for the help!

@@ -72,6 +72,10 @@ spec:
name: clusters-config-volume
- mountPath: /etc/secrets/
name: admin-secrets
{{- if gt (len .Values.configmap.clusters.labelClusterMap) 0 }}
- mountPath: /var/run/credentials
name: cluster-credentials
Contributor

but this is a volumeMount right? doesn't that mean there needs to be a volume declared with the same name in the Pod?
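
As background for this question, a minimal, generic Kubernetes sketch of the rule being referred to: every `volumeMounts[].name` in a container must match a `volumes[].name` declared in the same Pod spec. The Pod name, container, and image are hypothetical; the volume name and mount path are the ones from the diff, and the secret name is the one from the PR description.

```yaml
# Sketch only: a volumeMount is valid only if the Pod declares a volume
# with the same name.
apiVersion: v1
kind: Pod
metadata:
  name: mount-example            # hypothetical
spec:
  containers:
    - name: app                  # hypothetical
      image: busybox:1.36
      command: ["sh", "-c", "ls /var/run/credentials && sleep 3600"]
      volumeMounts:
        - name: cluster-credentials      # must match a volume name below
          mountPath: /var/run/credentials
          readOnly: true
  volumes:
    - name: cluster-credentials
      secret:
        secretName: cluster-secrets      # name from the PR description
```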

Contributor Author

@gdabisias gdabisias Sep 11, 2023

no, because this is done in the guide as part of the multicluster setup (not ideal, but following what is currently there)
https://docs.flyte.org/en/latest/deployment/deployment/multicluster.html#id2

Contributor

oh i see it's in additionalVolumes
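
For readers following along, a hedged sketch of what such a values entry could look like. The `additionalVolumes` key and the `cluster-credentials` volume name come from this thread; placing it under `flyteadmin` follows the `Values.flyteadmin.additionalVolumeMounts` reference elsewhere in this review, and the secret name is the one from the PR description, so treat both as assumptions about the user's actual values file.

```yaml
# Sketch only: declaring the volume through the chart's values file.
flyteadmin:
  additionalVolumes:
    - name: cluster-credentials        # volume name the mounts refer to
      secret:
        secretName: cluster-secrets    # assumed, from the PR description
```

The volume name is what ties this declaration to any `volumeMounts` entry, so a different name here would require the same name in every mount that uses it.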

Contributor

but wait shouldn't that come through in the additional mounts?

  additionalVolumeMounts:
  - name: cluster-credentials
    mountPath: /var/run/credentials

which gets injected here: https://github.com/flyteorg/flyte/blob/a071bade39bdad80fff042c235a1c3c046a82a09/charts/flyte-core/templates/admin/deployment.yaml#L137C37-L137C59

Contributor

@gdabisias I just found that if we change the process so that, instead of creating a new secret, we edit the existing flyte-admin-secrets and add the data plane cluster token and cert there, the syncresources Pod works just fine.
Right, no surprises there.

Contributor Author

@davidmirror-ops as you mentioned in the other comment, we should have a separate secret

Contributor Author

@gdabisias gdabisias Sep 18, 2023

@wild-endeavor no, because here we are updating the sync-cluster-resources init container, not the admin one. For the admin one, we add the secret mount point in values-eks.yaml.

Contributor

should we reference Values.flyteadmin.additionalVolumeMounts also instead of directly adding the mountpath?

The issue is that this only works if the user specifies that string in the values file for the additional volumes. if they use a different string than "cluster-credentials" then this won't work right?

Contributor Author

yes, it won't work. My idea was to fix things according to the guide, but we can also just add that to the values-eks.yaml.
Up to you, I don't mind. Either way, we should remove this secret creation and mounting completely and it should be part of the general chart, with the user only adding the secret value itself

Contributor Author

I made the change, but note that we are now mounting everything specified in the additional volumes section, so other things may get mounted too. (Still better than what we had before, and I don't see why something mounted by the admin container should not be mounted by its init container.)
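
A rough sketch of the approach discussed here, in which the init container renders whatever the user listed under `Values.flyteadmin.additionalVolumeMounts` instead of hardcoding the `cluster-credentials` mount. This is an illustration under stated assumptions, not the merged diff: the surrounding structure and indentation are hypothetical, and only the value path, the container name, and the `admin-secrets` mount come from this thread.

```yaml
# Sketch only: reuse the user-supplied mounts in the init container.
initContainers:
  - name: sync-cluster-resources
    volumeMounts:
      - mountPath: /etc/secrets/
        name: admin-secrets
      # Render every user-defined mount; emits nothing when the list is empty.
      {{- with .Values.flyteadmin.additionalVolumeMounts }}
      {{- toYaml . | nindent 6 }}
      {{- end }}
```

With this shape the volume name no longer has to be the literal string "cluster-credentials"; it only has to agree between the user's additional volumes and additional mounts. The trade-off, as noted above, is that every additional mount is applied to the init container as well.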

Contributor

@davidmirror-ops davidmirror-ops left a comment

I reproduced this and it works

@davidmirror-ops davidmirror-ops merged commit b52bbe2 into flyteorg:master Sep 27, 2023
15 checks passed
@welcome

welcome bot commented Sep 27, 2023

Congrats on merging your first pull request! 🎉

3 participants