This repository presents a complete SQL-based Data Analytics Pipeline, progressing from Data Warehousing to Exploratory Data Analysis (EDA) and finally to Advanced Analytics. The goal is to create a structured, efficient, and insightful SQL-driven analytical workflow.
🟢 Objective: Build a Data Warehouse using SQL Server, implementing ETL (Extract, Transform, Load) processes.
🛠 Approach: Leverages the Medallion Architecture (Bronze, Silver, and Gold layers) to store and transform raw data into business-ready insights.
- 🏛 Data Architecture: Designed using Star Schema with fact and dimension tables.
- ⚙ ETL Pipelines: Batch processing strategies for data ingestion and transformation.
- 📊 Final Output: Clean, structured data stored in the Gold Layer for analytics.
🔗 Reference: Data Warehouse Project
🟢 Objective: Uncover insights, trends, and anomalies in the dataset using SQL queries.
🛠 Approach: Uses the Gold Layer from the DWH to perform dimension and measure analysis.
- 🏷 Dimension Analysis: Understanding segmentation (e.g., customer demographics, product categories).
- 📊 Measure Exploration: Computing key metrics (e.g., revenue, total sales, average price).
- 📈 Ranking & Trend Analysis: Identifying top/bottom-performing entities using SQL functions.
🔗 Reference: EDA Project
🟢 Objective: Perform complex analytical operations to extract deeper business insights.
🛠 Approach: Uses advanced SQL techniques, including trend analysis, cumulative metrics, segmentation, and performance evaluation.
- ⏳ Time-Series Analysis: Identifying changes over time using
GROUP BY
,DATETRUNC
, andLAG
. - 📊 Cumulative Metrics: Running totals, moving averages, and YoY comparisons.
- 🏆 Performance Analysis: Ranking, category contribution analysis, and part-to-whole evaluations.
- 🔍 Segmentation & Reporting: Customer segmentation with
CASE WHEN
, product performance evaluation.
🔗 Reference: Advanced Analytics Project
- 🗄 SQL Server: Data processing & querying.
- 📂 CSV Datasets: Source files for ETL processes.
- 📊 SSMS: SQL Server Management Studio for database interaction.
- 🖼 DrawIO: Data architecture visualization.
- 🐙 Git & GitHub: Version control & collaboration.
This project is licensed under the MIT License.