This AWS CDK (Cloud Development Kit) project establishes an architecture to manage interactions among Amazon Elastic File System (EFS), AWS Batch, Amazon S3, and API Gateway Lambda functions.
This repository has been based on the project aws-s3-efs-lambda by smislam
.
AWS Batch jobs dealing with multiple tasks (job arrays) and requiring large files typically download each file from S3 based on the task index in parallel. This incurs additional costs and slows down job execution.
By configuring EFS, a shared file system can be mounted with each job (container). This simplifies file access from the container via a local path, enhancing efficiency and reducing overhead in downloading files from S3.
Caution
This is an experimental project. I am not an expert in systems or security. Currently, I'm researching cloud-based 3D rendering with AWS Batch and have decided to make this repository public so other developers can analyze and improve it for their own solutions.
This logic allows us to upload files to S3 and have a lambda function triggered on each upload to copy the file to EFS.
We do this to write content to EFS and subsequently visualize the files in AWS Batch jobs.
This logic enables us to configure all aspects of AWS Batch to submit a job and mount an EFS volume in the job containers.
The container image comes from Docker Hub. It's a public image with a small bash script that lists the local path where the EFS has been mounted.
Index 0 | Index 1 |
---|---|
The main components of this stack include:
-
VPC Setup
- Creating a Virtual Private Cloud (VPC) for isolating resources.
-
Security Groups
- Defining security groups to control inbound and outbound traffic.
-
EFS (Elastic File System) Configuration
- Setting up the EFS file system within the VPC for scalable file storage.
-
Access Point
- Establishing an access point within the EFS to permit specific access to the file system.
-
S3 Bucket
- Establishing an S3 bucket for storing files.
-
Lambda Functions
- Implementing Lambda functions for file operations:
- Copy To EFS Function: Moving files from S3 to EFS.
- List EFS Contents Function: Listing files present in the EFS.
- Implementing Lambda functions for file operations:
-
API Gateway
- Setting up an API Proxy using Lambda to list files present in the EFS.
-
AWS Batch Resources
- Configuring resources for AWS Batch:
- Defining a job queue, compute environment, and job definition.
- Configuring resources for AWS Batch:
Before deploying this CDK stack, ensure you have:
- Appropriate AWS credentials set up.
- AWS CDK installed and configured.
-
Installation
npm install
-
Deployment
cdk deploy
-
Constants: Adjust constant values like
lambdaLocalMountPath
in theCdkEfsBatchS3Stack
file to suit your requirements. -
Customizations: Modify function names, resource names, and other parameters within respective function calls as needed.
Once deployed, the stack provides various functionalities:
- File Transfer: Files from S3 are moved to the configured EFS.
- Listing Files: Access the API endpoint to list files present in the EFS.
To remove the deployed stack and associated resources:
cdk destroy