This repository contains an ELT (Extract, Load, Transform) pipeline built with Apache Airflow. The pipeline moves data from a source in S3 into a Redshift data warehouse, applying SQL transformations inside the warehouse after the load.
## Technologies Used

- Python
- Apache Airflow
- SQL
- AWS Redshift
## Prerequisites

- Python 3.x
- Apache Airflow
- AWS Redshift
## Setup

1. Clone the repository: `git clone https://github.com/vgonzenbach/airflow_elt.git`
2. Navigate to the project directory: `cd airflow_elt`
3. Install the required packages: `pip install -r requirements.txt`
4. Update the template configuration file with your specific settings and rename it to `airflow.cfg` (example settings are sketched after this list).
5. Run the setup scripts to initialize the database: `./setup/init_db.sh`
6. Start the Airflow web server: `./start-airflow.sh`
7. Open the Airflow UI and trigger the DAG (a sketch of such a DAG follows below).
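
The exact configuration depends on your environment, but the settings most often adjusted in `airflow.cfg` are the DAGs folder, the executor, and the metadata database connection. A minimal sketch; all paths and credentials below are placeholders, not values from this repository:

```ini
[core]
# Point Airflow at this project's DAGs (placeholder path).
dags_folder = /path/to/airflow_elt/dags
executor = LocalExecutor
load_examples = False

[database]
# Placeholder connection string; on Airflow versions before 2.3
# this key lives under [core] instead of [database].
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow

[webserver]
web_server_port = 8080
```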
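
The DAG code shipped in this repository is the source of truth; purely for orientation, a minimal sketch of an S3-to-Redshift ELT DAG built from standard Airflow provider operators (Airflow 2.x with the Amazon and common SQL providers installed) might look like the following. All bucket, schema, table, and connection names are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

with DAG(
    dag_id="elt_pipeline",            # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule=None,                    # triggered manually from the UI
    catchup=False,
) as dag:
    # Extract/Load: COPY raw files from S3 into a Redshift staging table.
    load = S3ToRedshiftOperator(
        task_id="load_from_s3",
        schema="staging",                   # hypothetical schema
        table="raw_events",                 # hypothetical table
        s3_bucket="example-bucket",         # hypothetical bucket
        s3_key="data/events.csv",           # hypothetical key
        redshift_conn_id="redshift_default",
        aws_conn_id="aws_default",
        copy_options=["CSV", "IGNOREHEADER 1"],
    )

    # Transform: run SQL inside Redshift after the load (the "T" in ELT).
    transform = SQLExecuteQueryOperator(
        task_id="transform_in_redshift",
        conn_id="redshift_default",
        sql="""
            INSERT INTO analytics.events
            SELECT event_id, user_id, CAST(ts AS TIMESTAMP)
            FROM staging.raw_events;
        """,  # hypothetical transformation
    )

    load >> transform
```

The key ELT property is visible in the task order: data is loaded into Redshift first, and the transformation then runs as SQL inside the warehouse.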
## Contributing

Feel free to fork the project and submit a pull request with your changes!

## License

This project is licensed under the MIT License.