[enhancement] Provide an automated way to renew the token if existing token expires for sveltoscluster mgmt object. #995

Open
jasvinder1107 opened this issue Jan 30, 2025 · 0 comments
Labels
enhancement Small feature, request or improvement suggestion

jasvinder1107 commented Jan 30, 2025

Use case: If the management cluster goes down for more than 3 hours because of maintenance activity or a hardware issue, the token for the mgmt SveltosCluster object expires. This leads to continuous errors and failure to reconcile the management cluster's SveltosCluster object. The documentation states that once the token has expired, the tokenRequestRenewalOption setting no longer helps:
https://projectsveltos.github.io/sveltos/register/token-renewal/

If, for any reason, token rotation cannot happen before the current token expires, the sveltoscluster-manager can no longer update the token. Consequently, reconciliations for that cluster stop, and you must manually update the Secret for that cluster to restore functionality.
2025-01-30T17:22:02.617497+00:00 bigbender k0s[1216]: time="2025-01-30 17:22:02" level=info msg="E0130 17:22:02.617216    1290 authentication.go:73] \"Unable to authenticate the request\" err=\"[invalid bearer token, service account token has expired]\"" component=kube-apiserver stream=stderr
2025-01-30T17:22:02.645808+00:00 bigbender k0s[1216]: time="2025-01-30 17:22:02" level=info msg="E0130 17:22:02.645588    1290 authentication.go:73] \"Unable to authenticate the request\" err=\"[invalid bearer token, service account token has expired]\"" component=kube-apiserver stream=stderr
2025-01-30T17:22:02.684750+00:00 bigbender k0s[1216]: time="2025-01-30 17:22:02" level=info msg="E0130 17:22:02.684503    1290 authentication.go:73] \"Unable to authenticate the request\" err=\"[invalid bearer token, service account token has expired]\"" component=kube-apiserver stream=stderr
2025-01-30T17:22:02.741968+00:00 bigbender k0s[1216]: time="2025-01-30 17:22:02" level=info msg="E0130 17:22:02.741191    1290 authentication.go:73] \"Unable to authenticate the request\" err=\"[invalid bearer token, service account token has expired]\"" component=kube-apiserver stream=stderr
2025-01-30T17:22:02.781249+00:00 bigbender k0s[1216]: time="2025-01-30 17:22:02" level=info msg="E0130 17:22:02.780966    1290 authentication.go:73] \"Unable to authenticate the request\" err=\"[invalid bearer token, service account token has expired]\"" component=kube-apiserver stream=stderr

The job responsible for this cluster object hard-codes the timeouts: 3 hours for the first token and 1 hour for each subsequent renewal.
https://github.com/projectsveltos/register-mgmt-cluster/blob/8c1dacccf04c932566bce543e4583a612bd5962e/cmd/main.go#L439C1-L439C65
https://github.com/projectsveltos/register-mgmt-cluster/blob/8c1dacccf04c932566bce543e4583a612bd5962e/cmd/main.go#L120

The job also sets ttlSecondsAfterFinished: 240 (https://github.com/projectsveltos/helm-charts/blob/b2652715ee7b3e373705941f45e9294007a2f2f7/charts/projectsveltos/templates/register-mgmt-cluster-job.yaml#L36), which means the job is gone 4 minutes after it finishes. This leaves the customer no room to fix the issue.
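To confirm that the stored token really has expired, its `exp` claim can be read directly, since service account tokens are JWTs. The following is a minimal sketch, not part of Sveltos: the function names `jwt_exp` and `jwt_expiring` are hypothetical, and it assumes GNU `base64` and a token whose payload carries a numeric `exp` field.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Sveltos or sveltosctl).

# Print the `exp` claim (Unix seconds) of the JWT passed as $1.
jwt_exp() {
  local payload pad
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding omits.
  pad=$(( (4 - ${#payload} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do
    payload="${payload}="
    pad=$((pad - 1))
  done
  printf '%s' "$payload" | base64 -d | sed -n 's/.*"exp": *\([0-9][0-9]*\).*/\1/p'
}

# Succeed when the token in $1 expires within the next $2 seconds (default 0).
# Returns 2 when no `exp` claim could be read.
jwt_expiring() {
  local exp now
  exp=$(jwt_exp "$1")
  [ -n "$exp" ] || return 2
  now=$(date +%s)
  [ "$exp" -le $(( now + ${2:-0} )) ]
}
```

For example, `jwt_expiring "$TOKEN" 600` succeeds when the token has less than ten minutes left, where `$TOKEN` is the bearer token copied out of the kubeconfig stored in the secret.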

root@bigbender:~/projectsveltos# kubectl get sveltoscluster -n mgmt  -o yaml
apiVersion: v1
items:
- apiVersion: lib.projectsveltos.io/v1beta1
  kind: SveltosCluster
  metadata:
    creationTimestamp: "2025-01-23T20:33:09Z"
    generation: 14
    labels:
      projectsveltos.io/k8s-version: v1.31.5
      sveltos-agent: present
    name: mgmt
    namespace: mgmt
    resourceVersion: "1114505"
    uid: e1fd1a1b-f7e8-4f86-8e6c-bb938d282c4c
  spec:
    consecutiveFailureThreshold: 3
    kubeconfigKeyName: re-kubeconfig
    tokenRequestRenewalOption:
      renewTokenRequestInterval: 1h0m0s
  status:
    connectionFailures: 48804
    connectionStatus: Down
    failureMessage: 'failed to get API group resources: unable to retrieve the complete
      list of server APIs: v1: Unauthorized'
    lastReconciledTokenRequestAt: "2025-01-29T21:25:46Z"
    ready: true
    version: v1.31.5+k0s
kind: List
metadata:
  resourceVersion: ""
root@bigbender:~/projectsveltos# 

You can see the failureMessage there: "failed to get API group resources: unable to retrieve the complete list of server APIs: v1: Unauthorized".

Possible solutions:

1. Handle this as a CronJob: build a bash script and run it as a separate job in https://github.com/k0rdent/kcm/tree/main/templates/provider/projectsveltos
2. Change the upstream Helm chart and maintain it under k0rdent. This doesn't sound like a good idea.
3. I am open to discussing a separate pod with Go code to handle this.
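
For option 1, the script would first need to decide whether the cluster is in the expired-token state. A rough sketch of that check follows. The function name `needs_token_fix` is hypothetical, and treating `Down` plus `Unauthorized` as "token expired" is an assumption based on the status shown above; other authentication failures would match as well.

```shell
#!/usr/bin/env bash
# Hypothetical helper for a CronJob script (not part of Sveltos).

# Succeed when the reported status looks like the expired-token case.
# $1: value of .status.connectionStatus
# $2: value of .status.failureMessage
needs_token_fix() {
  [ "$1" = "Down" ] || return 1
  case "$2" in
    *Unauthorized*) return 0 ;;
    *) return 1 ;;
  esac
}

# Intended use inside the job (requires cluster access, so left commented out):
#   status=$(kubectl get sveltoscluster mgmt -n mgmt -o jsonpath='{.status.connectionStatus}')
#   message=$(kubectl get sveltoscluster mgmt -n mgmt -o jsonpath='{.status.failureMessage}')
#   if needs_token_fix "$status" "$message"; then
#     echo "token looks expired, running the manual fix"
#   fi
```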

The current workaround is:

  1. Get the current kubeconfig from the secret.
 kubectl get secret -n mgmt mgmt-sveltos-kubeconfig -o json | jq -r '.data."re-kubeconfig"' | base64 -d > file1
  2. Run sveltosctl to generate a kubeconfig. Copy the generated token and use it to replace the token in file1. Only the token is taken, rather than the whole file, because the API server address generated by this command is localhost:6443, whereas the kubeconfig generated for the cluster object uses the service IP. This changes only the minimum required.
 root@bigbender:~# sveltosctl generate kubeconfig --create --expirationSeconds=86400 | grep -i token
  3. Run the patch command below to fix the token.
 kubectl patch secret -n mgmt mgmt-sveltos-kubeconfig --patch="{\"data\": { \"re-kubeconfig\": \"$(base64 -w0 ./file1)\" }}"
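
The manual copy-and-replace in the second step can be scripted. This is only a sketch: `replace_token` is a hypothetical helper, and it assumes the kubeconfig holds the credential on a single `token:` line and that the new token contains only JWT characters (letters, digits, `.`, `_`, `-`).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of sveltosctl).

# Replace the value of the `token:` field in a kubeconfig file in place.
# $1: path to the kubeconfig file, $2: new bearer token
replace_token() {
  local file=$1 token=$2 tmp
  tmp=$(mktemp) || return 1
  # Keep the original indentation and swap only the value after `token:`.
  # '|' is used as the sed delimiter because it never appears in a JWT.
  if sed "s|^\([[:space:]]*token:\).*|\1 ${token}|" "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

With the new token stored in a variable, `replace_token file1 "$NEW_TOKEN"` updates file1 so that the patch command can be run on it. How the token is extracted from the sveltosctl output depends on the exact output format, which is not shown here.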

After this, things start working again and the token-expired messages stop filling up the system logs.

 
root@bigbender:~# kubectl patch secret -n mgmt mgmt-sveltos-kubeconfig --patch="{\"data\": { \"re-kubeconfig\": \"$(base64 -w0 ./file1)\" }}"
secret/mgmt-sveltos-kubeconfig patched
root@bigbender:~#
root@bigbender:~#
root@bigbender:~# kubectl get sveltoscluster -A
NAMESPACE   NAME   READY   VERSION
mgmt        mgmt   true    v1.31.5+k0s
root@bigbender:~# kubectl get sveltoscluster -n mgmt -o yaml
apiVersion: v1
items:
- apiVersion: lib.projectsveltos.io/v1beta1
  kind: SveltosCluster
  metadata:
    creationTimestamp: "2025-01-23T20:33:09Z"
    generation: 14
    labels:
      projectsveltos.io/k8s-version: v1.31.5
      sveltos-agent: present
    name: mgmt
    namespace: mgmt
    resourceVersion: "1131771"
    uid: e1fd1a1b-f7e8-4f86-8e6c-bb938d282c4c
  spec:
    consecutiveFailureThreshold: 3
    kubeconfigKeyName: re-kubeconfig
    tokenRequestRenewalOption:
      renewTokenRequestInterval: 1h0m0s
  status:
    connectionStatus: Healthy
    lastReconciledTokenRequestAt: "2025-01-30T17:46:16Z"
    ready: true
    version: v1.31.5+k0s
kind: List
metadata:
  resourceVersion: ""
root@bigbender:~#


@jasvinder1107 jasvinder1107 added the enhancement Small feature, request or improvement suggestion label Jan 30, 2025
@jasvinder1107 jasvinder1107 changed the title [enhancement] Provide a automated way to renew the token if existing token expires fore sveltoscluster mgmt object. [enhancement] Provide a automated way to renew the token if existing token expires for sveltoscluster mgmt object. Jan 30, 2025