terragrunt does not renew the AWS session token automatically when the token is expired #3817

Open
2 tasks
gqrlt1207 opened this issue Jan 29, 2025 · 6 comments
Labels
bug Something isn't working

Comments

@gqrlt1207
Contributor

Describe the bug

When we used 'terragrunt' to migrate an AWS RDS cluster from one region to another, the AWS session token expired because the task took more than one hour to complete. As a result, the Terraform state file was not updated in the AWS S3 bucket (we store all Terraform state files in S3).

We run the 'terragrunt' command in Kubernetes pods and use IRSA to access AWS resources, which makes it impossible to extend the lifetime of the AWS session token beyond 1 hour.

@gqrlt1207 gqrlt1207 added the bug Something isn't working label Jan 29, 2025
@yhakbar
Collaborator

yhakbar commented Jan 29, 2025

Hey @gqrlt1207 ,

How are you assuming the role? We might need more details.

Generally speaking, if the role assumption expires in the middle of an OpenTofu/Terraform run, preventing OpenTofu/Terraform from pushing state after an apply, there won't be anything Terragrunt can do. At that stage, OpenTofu/Terraform is in control of the run.

I believe you can adjust the limit so that roles are assumable for longer than an hour, by the way:

This setting can have a value from 1 hour to 12 hours. For details about modifying the maximum session duration setting, see IAM role management. This setting determines the maximum session duration that you can request when you get the role credentials. For example, when you use the AssumeRole* API operations to assume a role, you can specify a session length using the DurationSeconds parameter. Use this parameter to specify the length of the role session from 900 seconds (15 minutes) up to the maximum session duration setting for the role. IAM users who switch roles in the console are granted the maximum session duration, or the remaining time in their user session, whichever is less. Assume that you set a maximum duration of 5 hours on a role. An IAM user that has been signed into the console for 10 hours (out of the default maximum of 12) switches to the role. The available role session duration is 2 hours. To learn how to view the maximum value for your role, see Update the maximum session duration for a role later in this page.

https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_manage-assume.html
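
The arithmetic in the quoted passage can be checked with a small Python sketch. The function names here are hypothetical and exist only for illustration; the numbers (900 seconds minimum, 12-hour default console session, 5-hour role maximum) come from the quote above.

```python
def available_console_session_hours(role_max_hours, signed_in_hours, user_session_max_hours=12):
    """Hours granted when an IAM user switches roles in the console.

    Per the quoted docs: the role's maximum session duration, or the time
    remaining in the user's own session, whichever is less.
    """
    remaining_user_session = user_session_max_hours - signed_in_hours
    return min(role_max_hours, remaining_user_session)


def is_valid_duration_seconds(requested_seconds, role_max_seconds):
    """True if a DurationSeconds value lies in the range the quote describes:
    from 900 seconds up to the role's maximum session duration setting."""
    return 900 <= requested_seconds <= role_max_seconds


# Worked example from the quote: 5-hour role maximum, user signed in for 10 of 12 hours.
print(available_console_session_hours(5, 10))  # 2
```

Note that this only models the limits described in the quote; it does not cover the separate role-chaining restriction discussed later in the thread.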

@gqrlt1207
Contributor Author

Thanks, @yhakbar. We actually run everything in a Kubernetes container and use IRSA to access AWS. AWS sets a 1-hour limit on the generated temporary credentials, which cannot be extended.

Given that restriction, would it be possible to start a goroutine when the 'terragrunt' command is called? The goroutine would watch the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN and, once they are updated, 'terragrunt' would automatically assume the role again using the refreshed AWS session token.

What do you think?

@yhakbar
Collaborator

yhakbar commented Jan 31, 2025

@gqrlt1207

What you're asking for does happen when using the auth-provider-cmd, but it won't help you if your role expires in the middle of an OpenTofu/Terraform run.

When Terragrunt spawns the OpenTofu/Terraform process, it sets the environment variables and everything else up front. OpenTofu/Terraform is then in control until it finishes, at which point Terragrunt moves on to something else, such as spawning another process for another unit. The problem you're encountering is that credentials expire in the middle of your OpenTofu/Terraform invocation, so either OpenTofu/Terraform needs to be the thing refreshing the credentials, or you need to set the expiration high enough that it doesn't matter.

The docs I linked above show how to change the maximum role assumption duration for OIDC-mediated assumed roles, which is my understanding of how IRSA works, though I might be mistaken.
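
The point about spawned processes can be illustrated with a minimal Python sketch. It is not Terragrunt code: the variable name DEMO_SESSION_TOKEN and the helper child_sees are made up for the demonstration. It shows that a child process keeps the environment it was given at spawn time, so updating variables in the parent afterwards does not reach a run that is already in progress.

```python
import os
import subprocess
import sys


def child_sees(var_name):
    """Spawn a child that waits briefly, then prints its own value of var_name."""
    code = (
        "import os, time; time.sleep(0.2); "
        f"print(os.environ.get({var_name!r}, ''))"
    )
    # The child receives a copy of the parent's environment when it is spawned.
    child = subprocess.Popen(
        [sys.executable, "-c", code], stdout=subprocess.PIPE, text=True
    )
    # Simulate the parent refreshing credentials while the child is still running.
    os.environ[var_name] = "refreshed-token"
    out, _ = child.communicate()
    return out.strip()


os.environ["DEMO_SESSION_TOKEN"] = "original-token"
result = child_sees("DEMO_SESSION_TOKEN")
print(result)  # original-token
```

This is why a watcher inside the parent process cannot help a long-running child: the refreshed value only becomes visible to processes spawned after the update.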

@gqrlt1207
Contributor Author

Thanks @yhakbar. In our environment, we run 'terragrunt' in a Kubernetes pod, and Kubernetes uses IRSA to generate the temporary AWS session token that the pod uses to communicate with AWS. In the terragrunt.hcl file, we set the iam_role that Terragrunt assumes using that temporary AWS session token.

AWS enforces a role-chaining limit for security reasons: when temporary AWS session credentials are used to assume another IAM role, the maximum session duration is 1 hour. That is what happens in our environment. Is it possible for 'terragrunt' to use the role and token file from the pod to assume the role itself?

@yhakbar
Collaborator

yhakbar commented Feb 5, 2025

@gqrlt1207 yup!

I recommend reading the docs for the auth-provider-cmd. It gives you very fine-grained control over how Terragrunt assumes roles. You can write a small script that extracts those values and sends them to Terragrunt like this:

{
  "awsRole": {
    "roleARN": "role-acquired-from-pod",
    "sessionName": "my-session-name",
    "duration": 3600,
    "webIdentityToken": "token-acquired-from-pod"
  }
}
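
A script along those lines might look like the following Python sketch. It rests on assumptions that are not stated in the thread: that the pod exposes the role ARN and the token file path through the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables, and that the command is expected to print the JSON object on stdout. The helper name build_auth_response is hypothetical; the JSON keys are the ones shown above.

```python
import json
import os


def build_auth_response(environ=os.environ, session_name="my-session-name", duration=3600):
    """Build the awsRole payload from values available inside the pod.

    Assumption: AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE are set in the pod.
    """
    role_arn = environ["AWS_ROLE_ARN"]
    token_file = environ["AWS_WEB_IDENTITY_TOKEN_FILE"]
    # Read the token fresh on every call so a rotated token file is picked up.
    with open(token_file, encoding="utf-8") as fh:
        token = fh.read().strip()
    return {
        "awsRole": {
            "roleARN": role_arn,
            "sessionName": session_name,
            "duration": duration,
            "webIdentityToken": token,
        }
    }


if __name__ == "__main__":
    # Emit the payload as JSON on stdout for the calling tool to consume.
    print(json.dumps(build_auth_response()))
```

Check the auth-provider-cmd docs for the exact schema and invocation before relying on this. As noted earlier in the thread, it only affects the credentials Terragrunt hands over when it starts a run, not credentials that expire mid-run.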

@gqrlt1207
Contributor Author

Thanks @yhakbar, I will try this.
