This repository contains a CloudFormation template that deploys a serverless, event-driven solution. The solution integrates AWS Backup with the Amazon RDS export feature to automate export tasks, and lets you query the exported data using Amazon Athena without provisioning a new RDS instance or Aurora cluster.
The following diagram illustrates the architecture of the solution.
Let’s go through the steps shown in the diagram above:
- You create a backup plan that stores database backups in the backup vault created by this solution.
- In this solution, we use AWS Backup as a signal source for an EventBridge rule.
- The EventBridge rule triggers an AWS Lambda function, which starts an export task for the database. This solution uses AWS Key Management Service (AWS KMS) to encrypt the database exports in Amazon S3.
- This solution uses Amazon Simple Storage Service (Amazon S3) to store the database exports.
If you do not need to query the exported data with Athena, you can leave that part of the solution out. When deploying the CloudFormation template, you can choose to skip the creation of the resources for steps 5, 6, and 7 (the three steps that follow).
- The EventBridge rule triggers a Lambda function when the export task is completed. The function uses Amazon Simple Notification Service (Amazon SNS) to send an email if the export task fails.
- The Lambda function uses AWS Glue to create a database and a crawler, and then runs the crawler.
- After the crawler successfully runs, you can use Amazon Athena to query the data directly in Amazon S3.
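As a rough illustration of the step in which the Lambda function starts the export task, the sketch below builds a request for the boto3 RDS `start_export_task` call and sends it through an injected client. It is not the function shipped with the template: the helper names `build_export_request` and `start_export`, the `export-` identifier prefix, and the assumption that the snapshot ARN is already known to the caller are all made up for this example.

```python
import re
from typing import Any, Optional


def build_export_request(
    snapshot_arn: str,
    bucket: str,
    role_arn: str,
    kms_key_id: str,
    export_only: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Build keyword arguments for the RDS StartExportTask API call."""
    # Derive a task identifier from the last ARN segment. As a conservative
    # choice (an assumption of this sketch), keep only letters, digits, and
    # single hyphens, and start with a letter via the "export-" prefix.
    snapshot_name = snapshot_arn.rsplit(":", 1)[-1]
    task_id = "export-" + re.sub(r"[^A-Za-z0-9]+", "-", snapshot_name).strip("-")
    request: dict[str, Any] = {
        "ExportTaskIdentifier": task_id,
        "SourceArn": snapshot_arn,
        "S3BucketName": bucket,
        "IamRoleArn": role_arn,
        "KmsKeyId": kms_key_id,
    }
    # ExportOnly is optional; omitting it exports all database objects.
    if export_only:
        request["ExportOnly"] = export_only
    return request


def start_export(rds_client: Any, request: dict[str, Any]) -> dict[str, Any]:
    """Send the request using a client such as boto3.client("rds")."""
    return rds_client.start_export_task(**request)
```

In a real Lambda function, `rds_client` would typically be `boto3.client("rds")`, and the bucket, role, and key values would come from the resources the stack creates. Passing the client in as an argument keeps the request-building logic easy to test without AWS credentials.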
To get started, create the solution resources using a CloudFormation template:
- Download the templates/automate-rds-aurora-export.yaml CloudFormation template to create a new stack.
- For Stack name, enter a name.
- For KMS Key Configuration, choose if you want a new KMS key to be created as part of this solution. If you already have an existing KMS key that you want to use, choose No.
- If you choose No for KMS key creation, you must enter a valid KMS key ID for the solution to use. You also need to configure the key users manually after the solution is deployed. Leave this field blank if you chose Yes for KMS Key Configuration.
- Under RDS Export Configuration, enter a valid email address to receive a notification when an S3 export task fails.
- To export only specific objects, enter the schema, database, or table names as a comma-separated list. Otherwise, leave this field blank to export all database objects. You can find more details about this parameter in the AWS Boto3 documentation.
- If you choose Yes, the solution will make exports automatically available in Athena.
- Choose Next.
- Accept all the defaults and choose Next.
- Acknowledge the creation of AWS Identity and Access Management (IAM) resources and choose Submit.
The stack creation starts with the status CREATE_IN_PROGRESS and takes approximately 5 minutes to complete.
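The comma-separated list of schema, database, or table names entered under RDS Export Configuration has to become a list of strings before it can be passed to boto3. The sketch below shows one way that conversion might look, assuming the value maps to the `ExportOnly` parameter of `start_export_task`; the function name `parse_export_only` is hypothetical and not taken from the template.

```python
def parse_export_only(raw: str) -> list[str]:
    """Split a comma-separated parameter value into a list of object names.

    Surrounding whitespace and empty entries are dropped, so a blank
    parameter yields an empty list (meaning: export everything).
    """
    return [item.strip() for item in raw.split(",") if item.strip()]


# Example: a database name and a database.table name in one parameter value.
selected = parse_export_only("sales, inventory.products")
```

An empty result should be treated as "no filter", which means leaving the filter parameter out of the request rather than passing an empty list.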
- On the Outputs tab, take note of the following resource names:
BackupVaultName
IamRoleForGlueService
IamRoleForLambdaBackupCompleted
IamRoleForLambdaExportCompleted
SnsTopicName
- If you decided to use an existing KMS key, give the IAM roles you noted from the Outputs tab access to that key. You can do this in the AWS Management Console using either the default view or the policy view.
- Check your email inbox and choose Confirm subscription in the email from Amazon SNS. Amazon SNS opens your web browser and displays a subscription confirmation with your subscription ID.
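For the existing KMS key case described above, the policy view expects JSON statements that name the IAM roles as principals. The following sketch generates statements modeled on the "key users" section of the default key policy that the console produces. The function name `key_user_statements`, the placeholder account ID, and the assumption that the stack outputs are plain role names without a path are all illustrative; check which roles and actions your deployment actually requires.

```python
import json
from typing import Any

# Actions granted to key users in the console's default key policy.
KEY_USER_ACTIONS = [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey",
]


def key_user_statements(account_id: str, role_names: list[str]) -> list[dict[str, Any]]:
    """Build key policy statements that make the given IAM roles key users."""
    # Assumes role names without a path; adjust the ARN if your roles use one.
    principals = [f"arn:aws:iam::{account_id}:role/{name}" for name in role_names]
    return [
        {
            "Sid": "Allow use of the key",
            "Effect": "Allow",
            "Principal": {"AWS": principals},
            "Action": KEY_USER_ACTIONS,
            "Resource": "*",
        },
        {
            "Sid": "Allow attachment of persistent resources",
            "Effect": "Allow",
            "Principal": {"AWS": principals},
            "Action": ["kms:CreateGrant", "kms:ListGrants", "kms:RevokeGrant"],
            "Resource": "*",
            "Condition": {"Bool": {"kms:GrantIsForAWSResource": "true"}},
        },
    ]


# Placeholder account ID and role names; substitute the values from the Outputs tab.
statements = key_user_statements("111122223333", ["ExampleLambdaRole", "ExampleGlueRole"])
policy_fragment = json.dumps(statements, indent=2)
```

The resulting statements would be merged into the `Statement` array of the existing key policy rather than replacing it.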
Now you’re ready to store all your RDS or Aurora database exports in Amazon S3 automatically and make them available in Athena. This solution works for any RDS or Aurora database backup that is taken with AWS Backup and stored in the backup vault created by the CloudFormation template.
Before you use this solution, ensure that your RDS instance supports exporting snapshots to Amazon S3. In some cases, tables or rows can be excluded from the export because they use incompatible data types. Review the feature limitations for RDS and Aurora, and test the data consistency between the source database and the exported data in Athena.
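One simple way to start the consistency test is to compare per-table row counts collected from the source database with counts collected from Athena. The sketch below only does the comparison; how the counts are gathered (for example, with `SELECT COUNT(*)` queries on each side) is left out, and the function name `find_row_count_mismatches` is hypothetical. Matching row counts do not prove that the data is identical, so treat this as a first check rather than a complete validation.

```python
from typing import Optional


def find_row_count_mismatches(
    source_counts: dict[str, int],
    athena_counts: dict[str, int],
) -> dict[str, tuple[Optional[int], Optional[int]]]:
    """Return tables whose row counts differ, as {table: (source, athena)}.

    A table missing on one side is reported with None for that side,
    which is how a table excluded from the export would show up.
    """
    tables = set(source_counts) | set(athena_counts)
    return {
        table: (source_counts.get(table), athena_counts.get(table))
        for table in sorted(tables)
        if source_counts.get(table) != athena_counts.get(table)
    }
```

An empty result means every table appears on both sides with the same number of rows.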
To avoid incurring future charges, delete the resources you created:
- On the AWS Backup console, delete the recovery points.
- On the Amazon S3 console, empty the S3 bucket created by the CloudFormation template to store the RDS database exports.
- On the AWS CloudFormation console, delete the stack that you created for the solution.
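The same cleanup can be scripted instead of using the consoles. The sketch below follows the three steps above using boto3-style clients passed in as arguments; the function names are hypothetical, and the vault, bucket, and stack names are whatever your deployment uses. It assumes the bucket is not versioned and does not wait for recovery point deletion to finish, so the stack deletion may need to be retried.

```python
from typing import Any


def list_recovery_point_arns(backup: Any, vault_name: str) -> list[str]:
    """Collect every recovery point ARN in the vault, following NextToken pages."""
    arns: list[str] = []
    kwargs: dict[str, Any] = {"BackupVaultName": vault_name}
    while True:
        page = backup.list_recovery_points_by_backup_vault(**kwargs)
        arns.extend(p["RecoveryPointArn"] for p in page.get("RecoveryPoints", []))
        token = page.get("NextToken")
        if not token:
            return arns
        kwargs["NextToken"] = token


def clean_up(backup: Any, bucket: Any, cloudformation: Any,
             vault_name: str, stack_name: str) -> int:
    """Delete recovery points, empty the bucket, then delete the stack.

    Returns the number of recovery points for which deletion was requested.
    """
    arns = list_recovery_point_arns(backup, vault_name)
    for arn in arns:
        backup.delete_recovery_point(BackupVaultName=vault_name, RecoveryPointArn=arn)
    # bucket is an S3 Bucket resource; this removes current objects only.
    bucket.objects.all().delete()
    cloudformation.delete_stack(StackName=stack_name)
    return len(arns)
```

With real credentials, the arguments would typically be `boto3.client("backup")`, `boto3.resource("s3").Bucket(...)`, and `boto3.client("cloudformation")`.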