implement X-Ray tracing for end-to-end observability #3

Merged
merged 13 commits into from
Aug 29, 2024
100 changes: 74 additions & 26 deletions README.md
@@ -230,7 +230,13 @@ This guide provides a quick way to get started with our project. Please see our
cd unity-initiator/terraform-unity/initiator/
```

1. Copy a sample router configuration YAML file to use for deployment and update the AWS region and AWS account ID to match your AWS environment. We will be using the NISAR TLM test case for this demo so we also rename the SNS topic ARN for it accordingly:
1. You will need an S3 bucket for terraform to stage the router Lambda zip file and router configuration YAML file during deployment. Create one or reuse an existing one and set an environment variable for it:

```
export CODE_BUCKET=<some S3 bucket name>
```

1. Copy a sample router configuration YAML file to use for deployment and update the AWS region and AWS account ID to match your AWS environment. We will be using the NISAR TLM test case for this demo so we also rename the SNS topic ARN for it accordingly. We then upload the router configuration file:

```
cp ../../tests/resources/test_router.yaml .
@@ -239,18 +245,7 @@ This guide provides a quick way to get started with our project. Please see our
sed -i "s/hilo-hawaii-1/${AWS_REGION}/g" test_router.yaml
sed -i "s/123456789012:eval_nisar_ingest/${AWS_ACCOUNT_ID}:uod-dev-eval_nisar_ingest-evaluator_topic/g" test_router.yaml
sed -i "s/123456789012:eval_airs_ingest/${AWS_ACCOUNT_ID}:uod-dev-eval_airs_ingest-evaluator_topic/g" test_router.yaml
```

1. You will need an S3 bucket for terraform to stage the router Lambda zip file during deployment. Create one or reuse an existing one and set an environment variable for it:

```
export CODE_BUCKET=<some S3 bucket name>
```

1. You will need an S3 bucket to store the router configuration YAML file. Create one or reuse an existing one (could be the same one in the previous step) and set an environment variable for it:

```
export CONFIG_BUCKET=<some S3 bucket name>
aws s3 cp test_router.yaml s3://${CODE_BUCKET}/test_router.yaml
```

1. Set a project name:
@@ -271,25 +266,36 @@ This guide provides a quick way to get started with our project. Please see our
terraform apply \
--var project=${PROJECT} \
--var code_bucket=${CODE_BUCKET} \
--var config_bucket=${CONFIG_BUCKET} \
--var router_config=test_router.yaml \
--var router_config=s3://${CODE_BUCKET}/test_router.yaml \
-auto-approve
```

**Take note of the `initiator_topic_arn` that is output by terraform. It will be used when setting up any triggers.**
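
Where `sed -i` behaves differently (for example on BSD/macOS), the router configuration substitutions shown above could be applied with a short Python script instead. This is only a sketch: `customize_router_config` and the sample ARN line are illustrative assumptions, while the search strings are taken from the `sed` commands above.

```python
def customize_router_config(text, region, account_id):
    """Apply the region and topic ARN substitutions shown in the sed commands."""
    text = text.replace("hilo-hawaii-1", region)
    for name in ("eval_nisar_ingest", "eval_airs_ingest"):
        text = text.replace(
            f"123456789012:{name}",
            f"{account_id}:uod-dev-{name}-evaluator_topic",
        )
    return text


# Hypothetical line resembling an SNS topic ARN entry in test_router.yaml.
sample = "topic_arn: arn:aws:sns:hilo-hawaii-1:123456789012:eval_nisar_ingest"
customized = customize_router_config(sample, "us-west-2", "111122223333")
print(customized)
```

The region and account ID passed in here are placeholder values; in the guide they come from `AWS_REGION` and `AWS_ACCOUNT_ID`.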

#### Deploying an Example Evaluator (SNS topic->SQS queue->Lambda)
#### Deploying Example Evaluators (SNS topic->SQS queue->Lambda)

1. Change directory to the location of the sns_sqs_lambda evaluator terraform:
In this demo we will deploy 2 evaluators:

1. `eval_nisar_ingest` - evaluate ingestion of NISAR telemetry files deposited into the ISL bucket

1. `eval_airs_ingest` - evaluate ingestion of AIRS RetStd files returned by a periodic CMR query

##### Evaluator Deployment for NISAR TLM (via staged data to the ISL)
1. Change directory to the location of the evaluators terraform:
```
cp -rp sns_sqs_lambda sns_sqs_lambda-nisar_tlm
cd ../evaluators
```

1. Make a copy of the `sns-sqs-lambda` directory for the NISAR TLM evaluator:

```
cp -rp sns-sqs-lambda sns-sqs-lambda-nisar-tlm
```

1. Change directory into the NISAR TLM evaluator terraform:

```
cd sns_sqs_lambda-nisar_tlm/
cd sns-sqs-lambda-nisar-tlm/
```

1. Set the name of the evaluator to our NISAR example:
@@ -301,7 +307,7 @@ This guide provides a quick way to get started with our project. Please see our
1. Note the implementation of the evaluator code. It currently doesn't do any real evaluation but simply returns that evaluation was successful:

```
cat data.tf
cat lambda_handler.py
```

1. Initialize terraform:
@@ -315,17 +321,59 @@ This guide provides a quick way to get started with our project. Please see our
```
terraform apply \
--var evaluator_name=${EVALUATOR_NAME} \
--var code_bucket=${CODE_BUCKET} \
-auto-approve
```

**Take note of the `evaluator_topic_arn` that is output by terraform. It should match the topic ARN in the test_router.yaml file you used during the initiator deployment. If they match then the router Lambda is now able to submit payloads to this evaluator SNS topic.**

##### Evaluator Deployment for AIRS RetStd (via scheduled CMR query)
1. Change directory to the location of the evaluators terraform:
```
cd ..
```

1. Make a copy of the `sns-sqs-lambda` directory for the AIRS RetStd evaluator:
```
cp -rp sns-sqs-lambda sns-sqs-lambda-airs-retstd
```

1. Change directory into the AIRS RetStd evaluator terraform:
```
cd sns-sqs-lambda-airs-retstd/
```

1. Set the name of the evaluator to our AIRS example:
```
export EVALUATOR_NAME=eval_airs_ingest
```

1. Note the implementation of the evaluator code. It currently doesn't do any real evaluation but simply returns that evaluation was successful:
```
cat lambda_handler.py
```

1. Initialize terraform:
```
terraform init
```

1. Run terraform apply:
```
terraform apply \
--var evaluator_name=${EVALUATOR_NAME} \
--var code_bucket=${CODE_BUCKET} \
-auto-approve
```

**Take note of the `evaluator_topic_arn` that is output by terraform. It should match the respective topic ARN in the test_router.yaml file you used during the initiator deployment. If they match then the router Lambda is now able to submit payloads to this evaluator SNS topic.**
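
One way to check the match described above is to rebuild the expected ARN and look for it in the router configuration. The sketch below assumes the topic name follows the `<project>-<venue>-<evaluator_name>-evaluator_topic` pattern implied by the `sed` substitutions earlier in this guide; `expected_evaluator_topic_arn`, `router_references_topic` and the sample values are hypothetical.

```python
def expected_evaluator_topic_arn(region, account_id, evaluator_name,
                                 project="uod", venue="dev"):
    """Build the SNS topic ARN the router config is expected to reference."""
    topic = f"{project}-{venue}-{evaluator_name}-evaluator_topic"
    return f"arn:aws:sns:{region}:{account_id}:{topic}"


def router_references_topic(router_yaml_text, arn):
    """Return True if the ARN appears anywhere in the router configuration text."""
    return arn in router_yaml_text


arn = expected_evaluator_topic_arn("us-west-2", "111122223333", "eval_airs_ingest")
# Hypothetical router configuration fragment.
config_text = f"topic_arn: {arn}\n"
print(arn, router_references_topic(config_text, arn))
```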

#### Deploying an S3 Event Notification Trigger

1. Change directory to the location of the s3_bucket_notification trigger terraform:
1. Change directory to the location of the s3-bucket-notification trigger terraform:

```
cd ../../triggers/s3_bucket_notification/
cd ../../triggers/s3-bucket-notification/
```

1. You will need an S3 bucket to configure event notification on. Create one or reuse an existing one (could be the same one in the previous steps) and set an environment variable for it:
@@ -382,10 +430,10 @@ This guide provides a quick way to get started with our project. Please see our

#### Deploying an EventBridge Scheduler Trigger

1. Change directory to the location of the s3_bucket_notification trigger terraform:
1. Change directory to the location of the scheduled-task trigger terraform:

```
cd ../scheduled_task/
cd ../scheduled-task/
```

1. Note the implementation of the trigger lambda code. It currently hard-codes a payload URL; in a real implementation, code would be written to query for new files from some REST API, database, etc. Here we simulate that and simply return a NISAR TLM file:
@@ -416,10 +464,10 @@ This guide provides a quick way to get started with our project. Please see our

#### Deploying an EventBridge Scheduler Trigger for Periodic CMR Queries

1. Change directory to the location of the s3_bucket_notification trigger terraform:
1. Change directory to the location of the cmr-query trigger terraform:

```
cd ../cmr_query/
cd ../cmr-query/
```

1. Note the implementation of the trigger lambda code. It queries CMR for granules of a particular collection within a timeframe, checks its DynamoDB table to see whether they already exist, and, if not, submits them as payload URLs to the initiator SNS topic and saves them into the DynamoDB table:
3 changes: 2 additions & 1 deletion scripts/build_cmr_query_lambda_package.sh
@@ -3,7 +3,7 @@ BASE_PATH=$(dirname "${BASH_SOURCE}")
BASE_PATH=$(cd "${BASE_PATH}/.."; pwd)
DIST_DIR=${BASE_PATH}/dist
PKG_DIR=${DIST_DIR}/lambda_packages
CMR_QUERY_DIR=${BASE_PATH}/terraform-unity/triggers/cmr_query
CMR_QUERY_DIR=${BASE_PATH}/terraform-unity/triggers/cmr-query

set -ex

@@ -15,6 +15,7 @@ VERSION=$(hatch run python -c 'from importlib.metadata import version; print(ver
echo "{\"version\": \"$VERSION\"}" > ${DIST_DIR}/version.json
mkdir -p $PKG_DIR
pip install -t $PKG_DIR ${DIST_DIR}/unity_initiator-*.whl
pip install -t $PKG_DIR aws_xray_sdk
pip install -t $PKG_DIR python_cmr
cp ${CMR_QUERY_DIR}/lambda_handler.py $PKG_DIR/
cd $PKG_DIR
21 changes: 21 additions & 0 deletions scripts/build_scheduled_task_lambda_package.sh
@@ -0,0 +1,21 @@
#!/bin/bash
BASE_PATH=$(dirname "${BASH_SOURCE}")
BASE_PATH=$(cd "${BASE_PATH}/.."; pwd)
DIST_DIR=${BASE_PATH}/dist
PKG_DIR=${DIST_DIR}/lambda_packages
SCHED_TASK_DIR=${BASE_PATH}/terraform-unity/triggers/scheduled-task-instrumented

set -ex

rm -rf $DIST_DIR
pip install hatch
hatch clean
hatch build
VERSION=$(hatch run python -c 'from importlib.metadata import version; print(version("unity_initiator"))')
echo "{\"version\": \"$VERSION\"}" > ${DIST_DIR}/version.json
mkdir -p $PKG_DIR
pip install -t $PKG_DIR ${DIST_DIR}/unity_initiator-*.whl
pip install -t $PKG_DIR aws_xray_sdk
cp ${SCHED_TASK_DIR}/lambda_handler.py $PKG_DIR/
cd $PKG_DIR
zip -rq ${DIST_DIR}/scheduled_task-${VERSION}-lambda.zip .
7 changes: 7 additions & 0 deletions src/unity_initiator/cloud/lambda_handler.py
@@ -4,13 +4,19 @@
from tempfile import mkstemp

import smart_open
from aws_xray_sdk.core import patch_all, xray_recorder

from ..router import Router
from ..utils.logger import logger

# initialize the AWS X-Ray SDK
patch_all()


ROUTER = None


@xray_recorder.capture("lambda_handler_base")
def lambda_handler_base(event, context):
"""Base lambda handler that instantiates a router, globally, and executes actions for a single payload."""

@@ -35,6 +41,7 @@ def lambda_handler_base(event, context):
f.write(router_cfg)
ROUTER = Router(router_file)
os.unlink(router_file)
xray_recorder.put_annotation("payload", event["payload"])
return ROUTER.execute_actions(event["payload"])
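
The handler keeps the router in a module-level global so that warm Lambda invocations can reuse it instead of reloading the configuration each time. The sketch below illustrates only that caching pattern; `FakeRouter`, `load_router` and `LOAD_COUNT` are hypothetical stand-ins, not part of `unity_initiator`, and X-Ray instrumentation is left out.

```python
ROUTER = None
LOAD_COUNT = 0  # counts how often the (expensive) load step runs


class FakeRouter:
    """Hypothetical stand-in for the real Router class."""

    def __init__(self, config_text):
        self.config_text = config_text

    def execute_actions(self, payload):
        return {"payload": payload, "config": self.config_text}


def load_router():
    """Stand-in for downloading the router config and building a Router."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return FakeRouter("router config")


def handler(event, context=None):
    """Build the router on the first call only, then reuse the cached instance."""
    global ROUTER
    if ROUTER is None:
        ROUTER = load_router()
    return ROUTER.execute_actions(event["payload"])


first = handler({"payload": "s3://example-bucket/file-1"})
second = handler({"payload": "s3://example-bucket/file-2"})
print(LOAD_COUNT)
```

The cache lives only as long as the Lambda execution environment does, so a cold start rebuilds the router.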


9 changes: 7 additions & 2 deletions terraform-unity/evaluators/sns-sqs-lambda/README.md
@@ -15,8 +15,9 @@

| Name | Version |
|------|---------|
| <a name="provider_archive"></a> [archive](#provider\_archive) | 2.4.2 |
| <a name="provider_aws"></a> [aws](#provider\_aws) | 5.51.1 |
| <a name="provider_local"></a> [local](#provider\_local) | 2.5.1 |
| <a name="provider_null"></a> [null](#provider\_null) | 3.2.2 |

## Modules

@@ -29,25 +30,29 @@ No modules.
| [aws_cloudwatch_log_group.evaluator_lambda_log_group](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_group) | resource |
| [aws_iam_policy.evaluator_lambda_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_role.evaluator_lambda_iam_role](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role) | resource |
| [aws_iam_role_policy_attachment.aws_xray_write_only_access](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy_attachment) | resource |
| [aws_iam_role_policy_attachment.lambda_base_policy_attachment](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy_attachment) | resource |
| [aws_iam_role_policy_attachment.lambda_policy_attachment](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy_attachment) | resource |
| [aws_lambda_event_source_mapping.evaluator_queue_event_source_mapping](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_event_source_mapping) | resource |
| [aws_lambda_function.evaluator_lambda](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_function) | resource |
| [aws_s3_object.lambda_package](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_object) | resource |
| [aws_sns_topic.evaluator_topic](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sns_topic) | resource |
| [aws_sns_topic_policy.evaluator_topic_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sns_topic_policy) | resource |
| [aws_sns_topic_subscription.evaluator_subscription](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sns_topic_subscription) | resource |
| [aws_sqs_queue.evaluator_dead_letter_queue](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sqs_queue) | resource |
| [aws_sqs_queue.evaluator_queue](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sqs_queue) | resource |
| [aws_sqs_queue_policy.evaluator_queue_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sqs_queue_policy) | resource |
| [aws_ssm_parameter.evaluator_lambda_function_name](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ssm_parameter) | resource |
| [archive_file.evaluator_lambda_artifact](https://registry.terraform.io/providers/hashicorp/archive/latest/docs/data-sources/file) | data source |
| [null_resource.build_lambda_package](https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource) | resource |
| [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source |
| [aws_iam_policy.mcp_operator_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy) | data source |
| [local_file.version](https://registry.terraform.io/providers/hashicorp/local/latest/docs/data-sources/file) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_code_bucket"></a> [code\_bucket](#input\_code\_bucket) | The S3 bucket where lambda zip files will be stored and accessed | `string` | n/a | yes |
| <a name="input_evaluator_name"></a> [evaluator\_name](#input\_evaluator\_name) | The evaluator name | `string` | n/a | yes |
| <a name="input_project"></a> [project](#input\_project) | The unity project it's installed into | `string` | `"uod"` | no |
| <a name="input_venue"></a> [venue](#input\_venue) | The unity venue it's installed into | `string` | `"dev"` | no |
23 changes: 23 additions & 0 deletions terraform-unity/evaluators/sns-sqs-lambda/build_lambda_package.sh
@@ -0,0 +1,23 @@
#!/bin/bash
BASE_PATH=$(dirname "${BASH_SOURCE}")
BASE_PATH=$(cd "${BASE_PATH}/../../.."; pwd)
DIST_DIR=${BASE_PATH}/dist
PKG_DIR=${DIST_DIR}/lambda_packages
EVALUATOR_DIR=$(dirname "${BASH_SOURCE}")
EVALUATOR_DIR=$(cd "${EVALUATOR_DIR}"; pwd)
EVALUATOR_NAME=$1

set -ex

rm -rf $DIST_DIR
pip install hatch
hatch clean
hatch build
VERSION=$(hatch run python -c 'from importlib.metadata import version; print(version("unity_initiator"))')
echo "{\"version\": \"$VERSION\"}" > ${DIST_DIR}/version.json
mkdir -p $PKG_DIR
pip install -t $PKG_DIR ${DIST_DIR}/unity_initiator-*.whl
pip install -t $PKG_DIR aws_xray_sdk
cp ${EVALUATOR_DIR}/lambda_handler.py $PKG_DIR/
cd $PKG_DIR
zip -rq ${DIST_DIR}/${EVALUATOR_NAME}-${VERSION}-lambda.zip .
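
The script writes `dist/version.json` and names the zip after the evaluator and the package version; the `local_file.version` data source in this module appears to read the same file. A minimal Python sketch of that naming convention follows, where `write_version_file` and `lambda_zip_name` are hypothetical helpers and the version value is made up.

```python
import json
import tempfile
from pathlib import Path


def write_version_file(dist_dir, version):
    """Mirror the script's step that writes {"version": ...} to version.json."""
    path = Path(dist_dir) / "version.json"
    path.write_text(json.dumps({"version": version}))
    return path


def lambda_zip_name(dist_dir, evaluator_name):
    """Derive the zip name as the script does: <name>-<version>-lambda.zip."""
    version = json.loads((Path(dist_dir) / "version.json").read_text())["version"]
    return f"{evaluator_name}-{version}-lambda.zip"


with tempfile.TemporaryDirectory() as dist_dir:
    write_version_file(dist_dir, "0.0.1")  # made-up version for illustration
    zip_name = lambda_zip_name(dist_dir, "eval_nisar_ingest")
print(zip_name)
```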
20 changes: 3 additions & 17 deletions terraform-unity/evaluators/sns-sqs-lambda/data.tf
@@ -4,21 +4,7 @@ data "aws_iam_policy" "mcp_operator_policy" {
name = "mcp-tenantOperator-AMI-APIG"
}

data "archive_file" "evaluator_lambda_artifact" {
type = "zip"
output_path = "${path.root}/.archive_files/${var.evaluator_name}-evaluator_lambda.zip"

source {
filename = "lambda_function.py"
content = <<CODE
def lambda_handler(event, context):
print(f"event: {event}")
print(f"context: {context}")

# implement your adaptation-specific evaluator code here and return
# True if it successfully evaluates. False otherwise.

return { "success": True }
CODE
}
data "local_file" "version" {
  filename   = "${path.module}/../../../dist/version.json"
  depends_on = [null_resource.build_lambda_package]
}
22 changes: 22 additions & 0 deletions terraform-unity/evaluators/sns-sqs-lambda/lambda_handler.py
@@ -0,0 +1,22 @@
import json

from aws_xray_sdk.core import patch_all, xray_recorder

from unity_initiator.utils.logger import logger

patch_all()


def perform_evaluation(event, context):
    logger.info("event: %s", json.dumps(event, indent=2))
    logger.info("context: %s", context)

    # Implement your adaptation-specific evaluator code here and return
    # True if it successfully evaluates. False otherwise.

    return True


def lambda_handler(event, context):
    with xray_recorder.capture(context.function_name):
        return {"success": perform_evaluation(event, context)}
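
As a sketch of what an adaptation-specific evaluator might look like, the example below unwraps SNS messages delivered through SQS and checks each one. It assumes the default (non-raw) SNS-to-SQS delivery, where each SQS record body is a JSON envelope with a `Message` field, and it assumes each message is a JSON object carrying a `payload` key. That message shape and the helper name `extract_messages` are hypothetical, and X-Ray instrumentation is omitted to keep the sketch self-contained.

```python
import json


def extract_messages(event):
    """Return the decoded SNS message from each SQS record in the event."""
    messages = []
    for record in event.get("Records", []):
        envelope = json.loads(record["body"])
        # Non-raw delivery wraps the published message as a string in "Message".
        messages.append(json.loads(envelope["Message"]))
    return messages


def perform_evaluation(event, context=None):
    """Succeed only if there is at least one message and each has a non-empty payload."""
    messages = extract_messages(event)
    return bool(messages) and all(bool(m.get("payload")) for m in messages)


# Hypothetical event, shaped like an SQS batch carrying one SNS notification.
sample_event = {
    "Records": [
        {"body": json.dumps({"Message": json.dumps({"payload": "s3://example-bucket/example_file"})})}
    ]
}
print({"success": perform_evaluation(sample_event)})
```

Returning `False` for an empty batch is a design choice made for this sketch; a real evaluator may prefer to treat that case differently.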