This is a sample solution for logging instance details in response to EC2 Spot Instance Interruption Warnings and analyzing them with CloudWatch Logs Insights.
Using CloudWatch Events, an event rule subscribes to EC2 Spot Instance Interruption Warnings and triggers a Lambda function, which collects details about the instance being interrupted and logs that information to a CloudWatch Logs log group. The solution can be configured to log any details you require about your instance; by default it logs InstanceId, InstanceType and all Tags. This information can then be used to develop visualizations with CloudWatch Logs Insights.
- A CloudWatch Event Rule subscribes to EC2 Spot Instance Interruption Warning Events.
{
"detail-type": [
"EC2 Spot Instance Interruption Warning"
],
"source": [
"aws.ec2"
]
}
- The CloudWatch Event Rule triggers a Lambda function in response to these events.
- The Lambda Function retrieves the instance ID from the Lambda event payload (the data sent to the Lambda function by CloudWatch Events).
- The Lambda Function makes a DescribeInstances call to EC2 to retrieve data about the EC2 Instance that is scheduled for interruption.
- The Lambda Function creates a CloudWatch Log Stream within a CloudWatch Log Group and logs the instance details to that Log Group (named via the CloudWatchLogGroupName parameter of the CloudFormation template).
{
"Event": {
"version": "0",
"id": "ce5fd17f-ef3c-6f86-b99a-35e8d883b1d2",
"detail-type": "EC2 Spot Instance Interruption Warning",
"source": "aws.ec2",
"account": "[REDACTED]",
"time": "2019-07-10T13:57:04Z",
"region": "us-west-2",
"resources": [
"arn:aws:ec2:us-west-2c:instance/i-0838c23d20458e79c"
],
"detail": {
"instance-id": "i-0838c23d20458e79c",
"instance-action": "terminate"
}
},
"InstanceDetails": {
"InstanceId": "i-0838c23d20458e79c",
"InstanceType": "c5.large",
"Placement": {
"AvailabilityZone": "us-west-2b",
"GroupName": "",
"Tenancy": "default"
},
"Tags": [
{
"Key": "aws:ec2:fleet-id",
"Value": "fleet-56e9a4cb-555f-ea91-a438-8b087e2181dd"
},
{
"Key": "aws:ec2launchtemplate:version",
"Value": "1"
},
{
"Key": "aws:autoscaling:groupName",
"Value": "SpotASG"
},
{
"Key": "aws:ec2launchtemplate:id",
"Value": "lt-0e85a5bd97a6d37ab"
}
]
}
}
- Logs can then be reviewed and/or further processed for any required analysis. See the Analyzing Logs section below for some recommendations using CloudWatch Logs Insights.
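The Lambda steps above can be sketched in Python roughly as follows. This is a minimal illustration and not the code shipped in the repository: the function names, the log stream naming scheme, and the choice to pass the clients in as arguments are all assumptions made here so the logic can be exercised without AWS credentials. The client methods used (`describe_instances`, `create_log_stream`, `put_log_events`) are standard boto3 calls.

```python
import json
import time


def extract_instance_id(event):
    """Return the instance ID carried in a Spot interruption warning event."""
    return event["detail"]["instance-id"]


def collect_instance_details(ec2_client, instance_id):
    """Call DescribeInstances and keep the fields shown in the sample log record."""
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    instance = response["Reservations"][0]["Instances"][0]
    return {
        "InstanceId": instance["InstanceId"],
        "InstanceType": instance["InstanceType"],
        "Placement": instance.get("Placement", {}),
        "Tags": instance.get("Tags", []),
    }


def write_log_record(logs_client, log_group_name, event, details):
    """Create a log stream and write one JSON record combining event and details."""
    # Hypothetical stream name: instance ID plus event ID, to keep streams unique.
    stream_name = f"{details['InstanceId']}-{event['id']}"
    logs_client.create_log_stream(
        logGroupName=log_group_name, logStreamName=stream_name
    )
    message = json.dumps({"Event": event, "InstanceDetails": details})
    logs_client.put_log_events(
        logGroupName=log_group_name,
        logStreamName=stream_name,
        # CloudWatch Logs expects timestamps in milliseconds since the epoch.
        logEvents=[{"timestamp": int(time.time() * 1000), "message": message}],
    )
    return message
```

Inside a real Lambda handler, the clients would typically be created with `boto3.client("ec2")` and `boto3.client("logs")`, and the log group name would be whatever was supplied through the CloudWatchLogGroupName parameter.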
Search for ec2-spot-interruption-logging-insights in the Serverless Application Repository and follow the instructions to deploy. (Make sure you've checked the box labeled: Show apps that create custom IAM roles or resource policies)
Note: For easiest deployment, create a Cloud9 instance and use the provided environment to deploy the function.
- AWS CLI already configured with Administrator permission
- Python 3 installed
- Docker installed
- SAM CLI installed
Once you've installed the requirements listed above, open a terminal session, as you'll need to run a few commands to deploy the solution.
First, we need an S3 bucket where we can upload our Lambda functions, packaged as ZIP files, before we deploy anything. If you don't have an S3 bucket to store code artifacts, this is a good time to create one:
aws s3 mb s3://BUCKET_NAME
Next, clone the ec2-spot-labs repository to your local workstation or to your Cloud9 environment.
git clone https://github.com/awslabs/ec2-spot-labs.git
Next, change directories to the root directory for this example solution.
cd ec2-spot-labs/ec2-spot-interruption-logging-insights
Next, run the following command to build the Lambda function:
sam build --use-container
Next, run the following command to package our Lambda function to S3:
sam package \
--output-template-file packaged.yaml \
--s3-bucket REPLACE_THIS_WITH_YOUR_S3_BUCKET_NAME
Next, the following command will create a CloudFormation stack and deploy your SAM resources.
sam deploy \
--template-file packaged.yaml \
--stack-name ec2-spot-interruption-logging-insights \
--capabilities CAPABILITY_IAM \
--parameter-overrides \
CloudWatchLogGroupName=REPLACE_THIS_WITH_THE_NAME_YOU_WANT \
CloudWatchLogGroupRetentionPeriodDays=REPLACE_THIS_WITH_NUMBER_OF_DAYS_TO_RETAIN_LOGS
- Log on to the AWS Console and navigate to the CloudWatch Console.
- Click on the 'Logs/Insights' Link in the Left Menu Bar.
- Enter your Log Group Name in the 'Select a log group' search bar.
- Select a time period (to the right of the search bar).
- Enter the following Query and then click 'Run query'.
fields @timestamp, @message
| sort @timestamp desc
- You should now see a list of interruptions over the time span you've selected. You can expand each entry to see more details about the event (such as InstanceType, AvailabilityZone, and Instance Tags).
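The same query can also be run outside the console. The sketch below uses the boto3 CloudWatch Logs calls `start_query` and `get_query_results`; the function name `run_insights_query`, its polling parameters, and the decision to pass the client in as an argument are assumptions made for illustration, not part of this solution.

```python
import time


def run_insights_query(logs_client, log_group_name, query, start_time, end_time,
                       poll_seconds=1.0, max_polls=60):
    """Start a Logs Insights query and poll until it completes.

    start_time and end_time are epoch timestamps in seconds.
    Returns one dict per result row, mapping field name to value.
    """
    started = logs_client.start_query(
        logGroupName=log_group_name,
        startTime=start_time,
        endTime=end_time,
        queryString=query,
    )
    query_id = started["queryId"]

    for _ in range(max_polls):
        response = logs_client.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            # Each row is a list of {"field": ..., "value": ...} pairs.
            return [
                {cell["field"]: cell["value"] for cell in row}
                for row in response["results"]
            ]
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"Query {query_id} ended with status {status}")
        time.sleep(poll_seconds)

    raise TimeoutError(f"Query {query_id} did not complete after {max_polls} polls")
```

With real credentials this could be called as `run_insights_query(boto3.client("logs"), "YOUR_LOG_GROUP", "fields @timestamp, @message | sort @timestamp desc", start, end)`, where the log group name and time range are placeholders you supply.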
In addition to querying your log data, you can use CloudWatch Logs Insights to build dashboard visualizations.
- Log on to the AWS Console and navigate to the CloudWatch Console.
- Click on the 'Logs/Insights' Link in the Left Menu Bar.
- Enter your Log Group Name in the 'Select a log group' search bar.
- Select a time period (to the right of the search bar).
- Enter the following Query and then click 'Run query'.
stats count(*) by InstanceDetails.InstanceType
- Click on the 'Visualization' tab.
- Change the graph type to 'Bar'.
- You should now see a visualization that shows count of interruptions by instance type for the time period you selected.
- To add this to a CloudWatch Dashboard, click on the 'Actions' Button and select 'Add to Dashboard'.
- Select an existing dashboard, or create a new one.
- Give your new widget a name such as 'Interruptions By Instance Type' and click 'Add to dashboard'.
- You'll be redirected to the Dashboard once it's created. From here you can adjust the time period using the selectors in the top right corner of the window.
- For example, here's a dashboard that shows the count of interruptions by instance type for the past 3 hours.
- Save your dashboard to commit any changes.
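To make clear what the `stats count(*) by InstanceDetails.InstanceType` query computes, here is a rough local equivalent in Python. It assumes each log message is a JSON string shaped like the sample record shown earlier (with an `InstanceDetails.InstanceType` field); the function name `count_by_instance_type` is hypothetical.

```python
import json
from collections import Counter


def count_by_instance_type(messages):
    """Count interruption log records per instance type.

    messages is an iterable of JSON strings, one per logged interruption.
    """
    counts = Counter()
    for message in messages:
        record = json.loads(message)
        counts[record["InstanceDetails"]["InstanceType"]] += 1
    return counts
```

Each key in the returned counter corresponds to one bar in the visualization, and its value is the height of that bar.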
You can build additional dashboard widgets and further analyze your log data using the CloudWatch Logs Insights query engine. You can learn more about CloudWatch Logs Insights here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html