EKS in AWS Curvenote account #2652
@@ -0,0 +1,116 @@
# See terraform/aws/curvenote/README.md
name: Terraform aws-curvenote

on:
  push:
    branches:
      - main
      - aws-curvenote2
    paths:
      - "terraform/aws/curvenote/**"
      - .github/workflows/terraform-deploy.yml
  workflow_dispatch:

# Only allow one workflow to run at a time
concurrency: terraform-deploy-aws-curvenote

env:
  TFPLAN: aws-curvenote.tfplan
  AWS_DEPLOYMENT_ROLE: arn:aws:iam::166088433508:role/binderhub-github-oidc-mybinderorgdeploy-terraform
  AWS_REGION: us-east-2
  WORKDIR: ./terraform/aws/curvenote

jobs:
  terraform-plan:
    runs-on: ubuntu-22.04
    timeout-minutes: 10
    # These permissions are needed to interact with GitHub's OIDC Token endpoint.
    permissions:
      id-token: write
      contents: read
    defaults:
      run:
        working-directory: ${{ env.WORKDIR }}
    outputs:
      apply: ${{ steps.terraform-plan.outputs.apply }}

    steps:
      - uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: ${{ env.AWS_DEPLOYMENT_ROLE }}
          aws-region: ${{ env.AWS_REGION }}
          role-session-name: terraform-plan

      - name: Terraform plan
        id: terraform-plan
        run: |
          terraform init
          EXIT_CODE=0
          terraform plan -out="${TFPLAN}" -detailed-exitcode || EXIT_CODE=$?
          if [ $EXIT_CODE -eq 0 ]; then
            echo "No changes"
            echo "apply=false" >> "$GITHUB_OUTPUT"
          elif [ $EXIT_CODE -eq 2 ]; then
            echo "Changes found"
            echo "apply=true" >> "$GITHUB_OUTPUT"
          else
            echo "Terraform plan failed"
            exit $EXIT_CODE
          fi

      - name: Encrypt plan
        if: steps.terraform-plan.outputs.apply == 'true'
        run: |
          echo ${{ secrets.TFPLAN_ARTIFACT_PASSPHRASE }} | gpg --batch --yes --passphrase-fd 0 --symmetric --cipher-algo AES256 --output "${TFPLAN}.gpg" "${TFPLAN}"

      - name: Upload plan
        if: steps.terraform-plan.outputs.apply == 'true'
        uses: actions/upload-artifact@v3
        with:
          name: ${{ env.TFPLAN }}
          path: ${{ env.WORKDIR }}/${{ env.TFPLAN }}.gpg
          if-no-files-found: error

  terraform-apply:
    needs:
      - terraform-plan
    runs-on: ubuntu-22.04
    timeout-minutes: 60
    # This environment requires approval before the deploy is run.
    environment: aws-curvenote
    # These permissions are needed to interact with GitHub's OIDC Token endpoint.
    permissions:
      id-token: write
      contents: read
    defaults:
      run:
        working-directory: ${{ env.WORKDIR }}
    if: needs.terraform-plan.outputs.apply == 'true'

    steps:
      - uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: ${{ env.AWS_DEPLOYMENT_ROLE }}
          aws-region: ${{ env.AWS_REGION }}
          role-session-name: terraform-apply

      - name: Download plan
        uses: actions/download-artifact@v3
        with:
          name: ${{ env.TFPLAN }}
          path: ${{ env.WORKDIR }}

      - name: Decrypt plan
        run: |
          echo ${{ secrets.TFPLAN_ARTIFACT_PASSPHRASE }} | gpg --batch --yes --passphrase-fd 0 --decrypt --cipher-algo AES256 --output "${TFPLAN}" "${TFPLAN}.gpg"
Review thread on the encrypted plan artifact:

- Can we use anything other than gpg? age perhaps?
- I'm using gpg because it's pre-installed in the base environment. This is purely for internal GH workflow use: there's no way to privately pass an artefact between separate jobs, and since this is a public repo it needs to be encrypted. It's not intended to be used by people/maintainers, since if we need the full details we can run Terraform anyway; it's just a way to ensure the public can't view any sensitive information in the plan. My plan is to create a random GitHub secret (the `TFPLAN_ARTIFACT_PASSPHRASE` used above) and use it as the passphrase.
- I'd still love to get rid of it. I basically can't trust myself to ever verify if a GPG command line is actually doing what it is supposed to do. IMO this is much simpler to understand and verify, and given there's already a lot of complexity here I'd love to not touch GPG at all.
- OK, I'll see if I can get that to work.
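Purely as an illustration of the kind of non-GPG alternative being discussed (a sketch under assumptions, not the reviewer's actual suggestion: it assumes `openssl` 1.1.1 or newer is available on the runner, and that the `TFPLAN_ARTIFACT_PASSPHRASE` secret is passed to the step through an `env:` entry rather than interpolated into the command line), the encrypt/decrypt pair could look roughly like this:

```sh
# Sketch only: passphrase-based symmetric encryption of the plan without GPG.
# Assumes TFPLAN_ARTIFACT_PASSPHRASE is exported in the step environment and
# openssl >= 1.1.1 (for -pbkdf2) is available on the runner.

# Encrypt (terraform-plan job); the salt is stored in the output file header.
openssl enc -aes-256-cbc -pbkdf2 -salt \
  -pass env:TFPLAN_ARTIFACT_PASSPHRASE \
  -in "${TFPLAN}" -out "${TFPLAN}.enc"

# Decrypt (terraform-apply job); -pbkdf2 must match what was used to encrypt.
openssl enc -d -aes-256-cbc -pbkdf2 \
  -pass env:TFPLAN_ARTIFACT_PASSPHRASE \
  -in "${TFPLAN}.enc" -out "${TFPLAN}"
```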
      - name: Terraform apply
        run: |
          terraform init
          terraform apply "${TFPLAN}"
@@ -0,0 +1,33 @@
name: Terraform static checks

on:
  pull_request:
    paths:
      - "terraform/**"
  push:
    paths:
      - "terraform/**"
  workflow_dispatch:

# We can't run CI tests on Terraform, so use as many static linters as possible

jobs:
  terraform-pre-commit:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version-file: ".python-version"

      - name: Install dependencies
        run: pip install pre-commit
      # https://github.com/terraform-linters/setup-tflint
      - name: Install tflint
        uses: terraform-linters/setup-tflint@v3
        with:
          tflint_version: v0.47.0
      - name: Run terraform pre-commit
        run: pre-commit run --all --config .pre-commit-config-terraform.yaml
@@ -0,0 +1,22 @@
# Config reference: https://pre-commit.com/#pre-commit-configyaml---top-level
#
# Common tasks
#
# - Run on all files: pre-commit run --all --config .pre-commit-config-terraform.yaml
#
# Prerequisites:
# - terraform
# - tflint

# Currently only aws/ is checked
files: "^terraform/aws/"
exclude: "^terraform/aws/pangeo/"

repos:
  # We can't run any CI tests on production Terraform code, so use as many static linters as possible
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.83.0
    hooks:
      - id: terraform_fmt
      - id: terraform_tflint
      - id: terraform_validate
@@ -0,0 +1,7 @@
# BinderHub on AWS EKS

This module deploys an AWS EKS cluster with IRSA roles to support BinderHub ECR access.

The module has optional support for using a limited non-administrative AWS role with a permissions boundary to deploy the cluster.

For an example see [curvenote](../curvenote/README.md)
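For a concrete sense of how such a configuration is driven (a sketch only, not part of the README: the working directory and plan filename are taken from the terraform-deploy workflow earlier in this diff, and AWS credentials for the deployment role are assumed to be configured already), a manual run of the curvenote example would look roughly like:

```sh
# Hypothetical local run of the example configuration that calls this module.
cd terraform/aws/curvenote
terraform init
terraform plan -out=aws-curvenote.tfplan
terraform apply aws-curvenote.tfplan
```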
@@ -0,0 +1,111 @@
# https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/19.15.2
# Full example:
# https://github.com/terraform-aws-modules/terraform-aws-eks/blame/v19.14.0/examples/complete/main.tf
# https://github.com/terraform-aws-modules/terraform-aws-eks/blob/v19.14.0/docs/compute_resources.md

data "aws_caller_identity" "current" {}

locals {
  permissions_boundary_arn = (
    var.permissions_boundary_name != null ?
    "arn:aws:iam::${data.aws_caller_identity.current.account_id}:policy/${var.permissions_boundary_name}" :
    null
  )
}

# This assumes the EKS service linked role is already created (or the current user has permissions to create it)
module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  version         = "19.15.3"
  cluster_name    = var.cluster_name
  cluster_version = var.k8s_version
  subnet_ids      = module.vpc.public_subnets

  cluster_endpoint_private_access      = true
  cluster_endpoint_public_access       = true
  cluster_endpoint_public_access_cidrs = var.k8s_api_cidrs

  vpc_id = module.vpc.vpc_id

  # Allow all allowed roles to access the KMS key
  kms_key_enable_default_policy = true
  # This duplicates the above, but the default is the current user/role so this will avoid
  # a deployment change when run by different users/roles
  kms_key_administrators = [
    "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root",
  ]

  enable_irsa                   = var.enable_irsa
  iam_role_permissions_boundary = local.permissions_boundary_arn

  eks_managed_node_group_defaults = {
    capacity_type                 = "SPOT"
    iam_role_permissions_boundary = local.permissions_boundary_arn
  }

  eks_managed_node_groups = {
    worker_group_1 = {
      name           = "${var.cluster_name}-wg1"
      instance_types = [var.instance_type_wg1]
      ami_type       = var.use_bottlerocket ? "BOTTLEROCKET_x86_64" : "AL2_x86_64"
      platform       = var.use_bottlerocket ? "bottlerocket" : "linux"

      # additional_userdata = "echo foo bar"
      vpc_security_group_ids = [
        aws_security_group.all_worker_mgmt.id,
        aws_security_group.worker_group_all.id,
      ]
      desired_size = var.wg1_size
      min_size     = 1
      max_size     = var.wg1_max_size

      # Disk space can't be set with the default custom launch template
      # disk_size = 100
      block_device_mappings = [
        {
          # https://github.com/bottlerocket-os/bottlerocket/discussions/2011
          device_name = var.use_bottlerocket ? "/dev/xvdb" : "/dev/xvda"
          ebs = {
            # Uses default alias/aws/ebs key
            encrypted   = true
            volume_size = var.root_volume_size
            volume_type = "gp3"
          }
        }
      ]

      subnet_ids = slice(module.vpc.public_subnets, 0, var.number_azs)
    },
    # Add more worker groups here
  }

  manage_aws_auth_configmap = true
  # Anyone in the AWS account with sufficient permissions can access the cluster
  aws_auth_accounts = [
    data.aws_caller_identity.current.account_id,
  ]
  aws_auth_roles = [
    {
      # GitHub OIDC role
      rolearn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${var.cluster_name}-${var.github_oidc_role_suffix}"
      username = "binderhub-github-oidc"
      groups   = ["system:masters"]
    },
    {
      # GitHub OIDC terraform role
      rolearn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${var.cluster_name}-${var.github_oidc_role_suffix}-terraform"
      username = "binderhub-github-oidc"
      groups   = ["system:masters"]
    },
    {
      # BinderHub admins role
      rolearn  = aws_iam_role.eks_access.arn
      username = "binderhub-admin"
      groups   = ["system:masters"]
    }
  ]
}

data "aws_eks_cluster_auth" "binderhub" {
  name = var.cluster_name
}
Review thread:

- I'm guessing this would need to be removed before merge?
- Yes, I'm using it to test the continuous deployment process on my fork. I'll remove it just before merging (or in a follow-up PR).