---
description: Documentation for deploying an Airflow-based U-SPS on MCP using Terraform
---

🚀 SPS Airflow Deployment with Terraform

Prerequisites

  • Access to an MCP account (aka a 'venue')
  • An SPS EKS cluster deployed in the MCP account into which you would like to deploy SPS Airflow. To do this, follow the instructions in the docs.
  • A customized SPS Airflow image with SPS DAGs baked into it. To build this image, follow the instructions in the docs.
  • The following tools installed on your local machine (at minimum, the ones used throughout this guide): git, Python, Terraform, terraform-docs, and optionally tfenv.

Dependencies from the other Unity Service Areas

A successful deployment of SPS depends on the following items from the other Unity Service Areas.

| Service Area | Description | AWS Service | Naming Convention |
| --- | --- | --- | --- |
| Common Services (CS) | An SSM parameter containing public and private subnet lists for the VPC that the SPS EKS cluster is deployed in. | AWS Systems Manager - SSM parameter | /unity/account/network/subnet_list |
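
As an optional sanity check, you can confirm this parameter exists in the target account with the AWS CLI (assuming your shell is already authenticated against the MCP account):

    aws ssm get-parameter \
      --name "/unity/account/network/subnet_list" \
      --query "Parameter.Value" \
      --output text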

Setup Instructions

Clone the SPS repository

git clone https://github.com/unity-sds/unity-sps.git
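
The remaining steps assume your working directory is the root of the cloned repository:

    cd unity-sps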

Configure Python environment

  • This Python environment will be used for executing tests.

  • From the root of the repository, create a Python virtualenv:

    python -m virtualenv venv
  • Install the required Python dependencies included in the unity-sps repo:

    source venv/bin/activate
    pip install -e ".[develop, test]"
  • Create a .env file for sensitive values used in the tests:

    touch .env
  • The .env should contain the following:

    # The password you would like to use for accessing Airflow,
    # this should match the value specified in your tfvars used to deploy Airflow
    AIRFLOW_WEBSERVER_PASSWORD=
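
Optionally, confirm that the virtualenv and the test dependencies installed cleanly before moving on (the .env file itself is only read later, when the tests run):

    source venv/bin/activate
    pip check          # verifies the installed packages have no broken or missing dependencies
    pytest --version   # confirms the test runner from the develop/test extras is available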

Configure the Terraform Workspace and prepare a tfvars File

  1. If using tfenv, create a .terraform-version file at the root of the repo containing the Terraform version required by versions.tf (a short tfenv sketch follows this list).

  2. Ensure the Terraform version you are using is equal to the value specified in versions.tf:

    terraform --version
  3. Auto-generate a tfvars template file using terraform-docs. Certain values in the generated file must then be filled in manually; these are tagged with comments in the example below:

    • From the root of the repository, execute the following commands:

      venue=INSERT-VENUE            # The MCP venue you are deploying into
      developer=INSERT-JPL-USERNAME # Your JPL username
      counter=INSERT-COUNTER        # It’s recommended to have this value match the counter included in the EKS cluster name. This will help you/others identify resources that are a part of the same SPS system.
      tfvars_filename=unity-${venue}-sps-${developer}-${counter}.tfvars
      
      cd terraform-unity
      mkdir tfvars
      terraform-docs tfvars hcl . --output-file "tfvars/${tfvars_filename}"
    • Manually override the following values in the auto-generated tfvars file:

    • Note: The BEGIN_TF_DOCS and END_TF_DOCS tags will need to be removed from the tfvars file.

      <!-- BEGIN_TF_DOCS -->
      airflow_webserver_password = "" # The password you would like to use for accessing Airflow
      counter                    = "" # It’s recommended to have this value match the counter included in the EKS cluster name. This will help you/others identify resources that are a part of the same SPS system.
      custom_airflow_docker_image = {
        "name": "ghcr.io/unity-sds/unity-sps/sps-airflow",
        "tag": "develop" # Set this to the value you used when you built a custom SPS Airflow image
      }
      eks_cluster_name = "" # The EKS cluster which you are deploying Airflow into
      helm_charts = {
        "airflow": {
          "chart": "airflow",
          "repository": "https://airflow.apache.org",
          "version": "1.11.0"
        },
        "keda": {
          "chart": "keda",
          "repository": "https://kedacore.github.io/charts",
          "version": "v2.13.1"
        }
      }
      kubeconfig_filepath = "../k8s/kubernetes.yml" # The path to the kubeconfig which corresponds to the EKS cluster which you are deploying Airflow into 
      project             = "unity"
      release             = "" # The current release/sprint you are deploying for, e.g. 24.1
      service_area        = "sps"
      venue               = "" # The MCP venue which you are deploying into. It’s recommended to have this value match the venue included in the EKS cluster name. This will help you/others identify resources that are a part of the same SPS system.
      <!-- END_TF_DOCS -->
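
A minimal sketch of step 1 using tfenv is shown below. The version string is a placeholder, and the path to versions.tf is an assumption; use whatever required_version the repo actually declares:

    # From the root of the repository: pin and install the Terraform version required by versions.tf
    grep required_version terraform-unity/versions.tf   # assumed location of versions.tf
    echo "1.x.y" > .terraform-version                   # placeholder; replace with the version reported above
    tfenv install                                       # installs the version listed in .terraform-version
    terraform --version                                 # should now report the pinned version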

Resource provisioning with Terraform

  1. Run a Terraform init:

    cd terraform-unity
    terraform init
  2. Run a Terraform plan:

    terraform plan -var-file=tfvars/${tfvars_filename}
  3. If you are satisfied with the output of the Terraform plan, run a Terraform apply to provision the resources (a verification sketch follows this list):

    terraform apply -var-file=tfvars/${tfvars_filename}
  4. When you are done with the resources deployed by the Terraform apply, destroy them using the terraform destroy command:

    terraform destroy -var-file=tfvars/${tfvars_filename}
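
After the apply in step 3 completes, you can optionally confirm that the Airflow pods are starting up on the EKS cluster. This is only a sketch: it assumes the Helm release installs into a namespace named airflow and reuses the kubeconfig path from the example tfvars above:

    # Run from the terraform-unity directory so the relative kubeconfig path resolves
    kubectl get pods --namespace airflow --kubeconfig ../k8s/kubernetes.yml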

Retrieve the endpoint for the Airflow UI

From the terraform-unity directory, run terraform output to print the load balancer hostname for the Airflow UI:

terraform output

Example output:

load_balancer_hostnames = {
  "airflow" = "http://k8s-airflow-airflowi-52301ddb8d-1489339230.us-west-2.elb.amazonaws.com:5000"
}
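
As a quick check before the full smoke test, you can hit the Airflow webserver's health endpoint with curl, substituting the load balancer hostname from your own terraform output:

    AIRFLOW_ENDPOINT=http://k8s-airflow-airflowi-52301ddb8d-1489339230.us-west-2.elb.amazonaws.com:5000  # example hostname; use your own
    curl -s "${AIRFLOW_ENDPOINT}/health"   # returns JSON health status for the Airflow metadatabase and scheduler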

Smoke test the SPS Airflow deployment

From the root of the repository, execute the following commands:

source venv/bin/activate
cd unity-test
# Substitute the load balancer hostname from your own terraform output
AIRFLOW_ENDPOINT=http://k8s-airflow-airflowi-52301ddb8d-1489339230.us-west-2.elb.amazonaws.com:5000
pytest -s -vv --gherkin-terminal-reporter step_defs/test_airflow_api_health.py --airflow-endpoint $AIRFLOW_ENDPOINT

Expected output

Feature: Airflow API health check
    Scenario: Check API health
        Given the Airflow API is up and running
        When I send a GET request to the health endpoint
        Then I receive a response with status code 200
        And each Airflow component is reported as healthy
        And each Airflow component's last heartbeat was received less than 30 seconds ago
    PASSED
=============================================== 1 passed in 0.37s ===============================================