Add retry flow for permission exceptions within Terra Run tasks #15

Open
marshall7m opened this issue Aug 20, 2022 · 0 comments
Labels
enhancement New feature or request

Comments

@marshall7m
Owner

marshall7m commented Aug 20, 2022

Problem
If a Terra Run task fails within the Step Function execution because of invalid IAM permissions (e.g., missing permissions for the terraform apply command), the execution ends without giving an admin user any opportunity to fix the IAM permissions and retry the Terra Run task. The user then has to manually start a new execution and, if the permission error occurred in the Terra Run Apply state, possibly send out a new approval request.

Possible Solutions
A:

  • Add a Catch block to the Plan and Apply states that specifically catches States.Permissions exceptions and triggers a downstream SNS state that notifies an admin user via their preferred method (e.g., email, mobile, eventually Slack?). The notification can include contextual information such as the execution ID, the PR #, and, more importantly, the specific IAM permissions that are needed (use an auto-generated policy via pike?). Once the admin user confirms that the permissions have been changed, the Step Function execution will retry the associated Terra Run state.
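A rough sketch of what option A could look like, written as Python that builds an Amazon States Language fragment. Everything named here is an assumption for illustration: the topic ARN, the state names, the `with_permission_retry` helper, and the use of `ecs:runTask.sync` for Terra Run tasks (task parameters are omitted). It also assumes the failure surfaces as `States.Permissions`; a permission error raised inside the task's own container may instead arrive as `States.TaskFailed`, in which case the `ErrorEquals` list would need adjusting.

```python
import json

# Hypothetical placeholders: none of these names or ARNs come from the issue.
ADMIN_TOPIC_ARN = "arn:aws:sns:us-west-2:123456789012:terra-run-admin-alerts"
TERRA_RUN_RESOURCE = "arn:aws:states:::ecs:runTask.sync"  # task parameters omitted


def with_permission_retry(state_name, next_state):
    """Return a Terra Run state plus a notify state that loops back to it."""
    notify_name = f"{state_name} Notify Admin"
    return {
        state_name: {
            "Type": "Task",
            "Resource": TERRA_RUN_RESOURCE,
            "Catch": [
                {
                    # Only permission errors take the notify-and-retry path.
                    "ErrorEquals": ["States.Permissions"],
                    "ResultPath": "$.permission_error",
                    "Next": notify_name,
                }
            ],
            "Next": next_state,
        },
        notify_name: {
            "Type": "Task",
            # Publishes to SNS, then pauses until the task token is returned,
            # i.e. until the admin confirms the permissions were fixed.
            "Resource": "arn:aws:states:::sns:publish.waitForTaskToken",
            "Parameters": {
                "TopicArn": ADMIN_TOPIC_ARN,
                "Message": {
                    "ExecutionId.$": "$$.Execution.Id",
                    "Error.$": "$.permission_error",
                    "TaskToken.$": "$$.Task.Token",
                },
            },
            "ResultPath": None,  # serialized as null: keep the original input
            "Next": state_name,  # retry the Terra Run state that failed
        },
    }


states = {}
states.update(with_permission_retry("Plan", "Request Approval"))
states.update(with_permission_retry("Apply", "Success"))
print(json.dumps(states, indent=2))
```

In this sketch the admin's confirmation would be whatever calls `SendTaskSuccess` with the task token from the notification. How that confirmation is collected (email link, Slack action, and so on) is left open, as in the proposal.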

B:

  • Within this Terraform module, provision an internal CodeCommit repository whose directory structure is kept in sync with the live infrastructure repo. Its directories will contain Terraform IAM policy resources that specify the permissions needed to run terraform apply within the live repo. We'll call this the "policy repo" and the infrastructure-live repo the "live repo".
  • When a PR is created in the live repo, the PR plan ECS task will generate the terraform apply IAM permission resources and create a PR within the policy repo that proposes the IAM policies needed for each of the Terraform directories that contain added/modified .tf files.
  • When the PR for the live repo is merged, the policy repo will merge its associated PR.
  • The Step Function definition will contain a parallel task that runs a Terra Run Plan task for both the policy and live repo Terraform configurations.
  • If the required approval count is met, the Terra Run Apply task will run for the policy repo's respective Terraform directory.
  • Once the policy's Apply task is successful, the Terra Run Apply task will run for the live repo's respective Terraform directory.

At a high level, this would look like:

[Image: Screen Shot 2022-08-20 at 1 06 15 PM]
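The ordering described in option B (plan both repos in parallel, wait for approval, apply the policy repo, then apply the live repo) could be sketched as a Step Functions definition built in Python. The state names, the `terra_run` and `chain` helpers, the `ecs:runTask.sync` resource, and the Lambda-based approval state are all assumptions for illustration, and task parameters are omitted.

```python
import json

# Hypothetical placeholder: the issue does not say how Terra Run tasks are invoked.
TERRA_RUN_RESOURCE = "arn:aws:states:::ecs:runTask.sync"


def terra_run(repo, command):
    """Build a placeholder Terra Run task state (task parameters omitted)."""
    return {
        "Type": "Task",
        "Resource": TERRA_RUN_RESOURCE,
        "Comment": f"terraform {command} for the {repo} repo",
    }


def chain(steps):
    """Link (name, state) pairs in order; the last state ends the flow."""
    states = {}
    for i, (name, state) in enumerate(steps):
        state = dict(state)  # copy so the caller's dict is not mutated
        if i + 1 < len(steps):
            state["Next"] = steps[i + 1][0]
        else:
            state["End"] = True
        states[name] = state
    return states


# One Parallel state with a Plan branch per repo.
plan_both = {
    "Type": "Parallel",
    "Branches": [
        {
            "StartAt": f"Plan {repo}",
            "States": chain([(f"Plan {repo}", terra_run(repo, "plan"))]),
        }
        for repo in ("policy", "live")
    ],
}

# Assumed approval mechanism: a callback task that waits for a task token.
request_approval = {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
}

definition = {
    "StartAt": "Plan",
    "States": chain(
        [
            ("Plan", plan_both),
            ("Request Approval", request_approval),
            ("Apply policy", terra_run("policy", "apply")),  # permissions first
            ("Apply live", terra_run("live", "apply")),  # then the live change
        ]
    ),
}
print(json.dumps(definition, indent=2))
```

Applying the policy repo before the live repo is what lets the live Apply run with the permissions it needs already in place.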

@marshall7m marshall7m added the enhancement New feature or request label Aug 20, 2022