This project implements an ETL (Extract, Transform, Load) pipeline using Python, Pandas, and MySQL to process retail transaction data. Additionally, a FastAPI application provides an API to retrieve customer summaries and product sales insights.
- ETL Pipeline: Cleans and loads data from CSV into MySQL.
- MySQL Database: Stores structured transaction data.
- FastAPI Application: Provides RESTful APIs for querying data.
Ensure the following are installed:
- Python 3.12.3
- MySQL Server
- MySQL Workbench
- Required Python dependencies
- Make sure the dataset is in CSV format for better results
Run the following command to install required packages:
pip install pandas mysql-connector-python fastapi uvicorn
Download and install MySQL from MySQL Official Site.
Create a database using MySQL Workbench or the command line:
CREATE DATABASE yourdatabase;
USE yourdatabase;
Note: The ETL script will automatically create the required database and tables.
Start the FastAPI app:
uvicorn main:app --reload
Note: Ensure the ETL script and FastAPI app run in the same environment.
- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
Modify the credentials in your script as needed:
DB_HOST = "e.g:123.0.0.1"
DB_USER = "your_username"
DB_PASSWORD = "your_password"
DB_NAME = "your_database_name"
- Endpoint:
GET /customer_summary
- Query Parameter:
customer_id
(optional, int) - Example Request:
GET http://127.0.0.1:8000/customer_summary?customer_id=12345
- Response Example:
[ { "CustomerID": 12345, "TotalSpent": 500.75, "TotalOrders": 15, "LastPurchaseDate": "2025-01-15 14:30:00" } ]
- Endpoint:
GET /product_sales
- Query Parameter:
product_code
(optional, string) - Example Request:
GET http://127.0.0.1:8000/product_sales?product_code=85123A
- Response Example:
[ { "StockCode": "85123", "TotalSold": 150, "TotalRevenue": 1200.50, "LastSaleDate": "2025-02-18 10:00:00" } ]
GET http://127.0.0.1:8000/customer_summary
GET http://127.0.0.1:8000/customer_summary?customer_id=12345
GET http://127.0.0.1:8000/product_sales
GET http://127.0.0.1:8000/product_sales?product_code=85123
This project provides an efficient ETL pipeline and a RESTful API for analyzing customer and product sales data. 🚀