The original documentation is available at https://aws.amazon.com/blogs/big-data/building-a-serverless-data-quality-and-analysis-framework-with-deequ-and-aws-glue/
- Set up the AWS CLI locally (temporary AWS credentials for the account can be used)
- Install Python >= 3.7 (see https://linuxize.com/post/how-to-install-python-3-7-on-ubuntu-18-04/)
- Install Node.js >= 14.7.0
curl -fsSL https://deb.nodesource.com/setup_current.x | sudo -E bash -
sudo apt-get install -y nodejs
- Install the required dependencies
cd backend
make install
mkdir -p ~/.npm-global
export PATH="$HOME/.npm-global/bin:$PATH"
export NPM_CONFIG_PREFIX="$HOME/.npm-global"
sudo npm install -g serverless serverless-pseudo-parameters serverless-python-requirements serverless-wsgi --unsafe
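Before moving on, it can help to confirm that the installed tools meet the minimum versions listed above (Python >= 3.7, Node.js >= 14.7.0). The following is a minimal sketch, not part of this repository: the helper names `version_ge` and `check_min_version` are hypothetical, and the comparison assumes GNU `sort -V`, which is available on Ubuntu.

```shell
# Hypothetical helpers (not part of this repository).

# Succeeds (exit 0) when version $1 is greater than or equal to version $2.
# Assumes GNU `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Reports whether a tool meets its minimum version.
# $1 = label, $2 = detected version, $3 = minimum version.
check_min_version() {
    if version_ge "$2" "$3"; then
        echo "OK: $1 $2 (>= $3)"
    else
        echo "TOO OLD: $1 $2 (need >= $3)"
        return 1
    fi
}

# Usage against the local machine (uncomment to run):
# check_min_version "Python" "$(python3 --version 2>&1 | awk '{print $2}')" 3.7
# check_min_version "Node.js" "$(node --version | sed 's/^v//')" 14.7.0
```

Version sort is used rather than plain string comparison so that, for example, 3.10 is correctly treated as newer than 3.7.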
- To deploy simply run:
cd ../
./deploy.sh -r $YOUR_REGION_HERE -p $YOUR_AMAZON_PROFILE -n $YOUR_STACK_NAME -e $YOUR_ENV_NAME
All of these parameters are optional; their default values are defined in the deploy.sh script.
  - $YOUR_REGION_HERE: the AWS region to deploy to
  - $YOUR_AMAZON_PROFILE: the AWS CLI profile to use (set during aws configure)
  - $YOUR_STACK_NAME: the name under which the stack appears in AWS CloudFormation
  - $YOUR_ENV_NAME: the environment to deploy resources to (dev/uat/prod)
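The contents of deploy.sh are not reproduced here, so the following is only a sketch of how optional flags with fallback defaults are commonly handled in a shell script using `getopts`. The function name `parse_deploy_args` and all default values are hypothetical placeholders; the real defaults are the ones defined in deploy.sh.

```shell
# Illustrative sketch only: the function name and the default values below are
# hypothetical and do NOT reflect the actual contents of deploy.sh.
parse_deploy_args() {
    REGION="us-east-1"           # hypothetical default for -r
    PROFILE="default"            # hypothetical default for -p
    STACK_NAME="example-stack"   # hypothetical default for -n
    ENV_NAME="dev"               # hypothetical default for -e

    OPTIND=1  # reset so the function can be called more than once
    while getopts "r:p:n:e:" opt; do
        case "$opt" in
            r) REGION="$OPTARG" ;;
            p) PROFILE="$OPTARG" ;;
            n) STACK_NAME="$OPTARG" ;;
            e) ENV_NAME="$OPTARG" ;;
            *) echo "Usage: ./deploy.sh [-r region] [-p profile] [-n stack_name] [-e env_name]" >&2
               return 1 ;;
        esac
    done
}
```

Because every flag has a fallback, an invocation such as `./deploy.sh -r eu-west-1 -e uat` would override only the region and environment and leave the profile and stack name at their defaults.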
- To test the deployment end to end, follow the instructions in the original blog post (the text and links below the architecture diagram): https://aws.amazon.com/blogs/big-data/building-a-serverless-data-quality-and-analysis-framework-with-deequ-and-aws-glue/
!!! TESTING THE SOLUTION WILL COST YOU MONEY (about 10 US cents :) ) !!!
- Build and push the Docker image with pushDockerfile.groovy
- Create a new Jenkins item of type Pipeline and use the Jenkinsfile to configure the job.
- Run the Jenkins job with the required parameters.
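The job can be started from the Jenkins web UI. As an alternative, a parameterized job can usually be triggered from the command line through Jenkins' `buildWithParameters` endpoint. The sketch below assumes that approach; the parameter names `REGION` and `ENV_NAME`, the environment variables `JENKINS_USER` and `JENKINS_API_TOKEN`, and both function names are hypothetical and should be replaced with whatever the Jenkinsfile and your Jenkins instance actually define.

```shell
# Hypothetical helpers (not part of this repository).

# Builds the remote-trigger URL for a parameterized Jenkins job.
# $1 = Jenkins base URL, $2 = job name (top-level job, not inside a folder).
jenkins_build_url() {
    printf '%s/job/%s/buildWithParameters\n' "${1%/}" "$2"
}

# Triggers the job over HTTP.
# $1 = Jenkins base URL, $2 = job name, $3 = region, $4 = environment.
# Assumes JENKINS_USER and JENKINS_API_TOKEN are set in the environment;
# REGION and ENV_NAME are hypothetical job parameter names.
trigger_jenkins_job() {
    curl -fsS -X POST "$(jenkins_build_url "$1" "$2")" \
        --user "${JENKINS_USER}:${JENKINS_API_TOKEN}" \
        --data-urlencode "REGION=$3" \
        --data-urlencode "ENV_NAME=$4"
}

# Example (uncomment and adjust to run):
# trigger_jenkins_job "https://jenkins.example.com" "my-pipeline" "eu-west-1" "dev"
```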