A scalable MLOps framework for automating, deploying, and monitoring machine learning workflows in production.
This repository provides an end-to-end MLOps pipeline to streamline the machine learning lifecycle—from data ingestion and model training to deployment and monitoring. Designed for data scientists and engineers, it emphasizes reproducibility, scalability, and collaboration by integrating DevOps practices into ML workflows.
## ✨ Features

- Automated ML Pipelines: Preprocessing, training, evaluation, and deployment orchestrated via workflow managers.
- CI/CD Integration: Automated testing and deployment using GitHub Actions/Jenkins.
- Model & Data Versioning: Track experiments, datasets, and models with DVC/MLflow.
- Production Monitoring: Real-time performance metrics and drift detection (see the sketch after this list).
- Explainability: SHAP/LIME integration for model interpretability.
- Cloud-Ready: Deploy on Kubernetes, AWS, or GCP with Docker containers.
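For example, input drift can be flagged by comparing live feature distributions against the training data. A minimal sketch using a two-sample Kolmogorov–Smirnov test (the function name and threshold are illustrative, not part of this repo):

```python
import numpy as np
from scipy.stats import ks_2samp  # two-sample Kolmogorov-Smirnov test

def detect_drift(reference: np.ndarray, live: np.ndarray, p_threshold: float = 0.05) -> bool:
    """Flag drift when any feature's live distribution diverges from the reference."""
    for feature in range(reference.shape[1]):
        _, p_value = ks_2samp(reference[:, feature], live[:, feature])
        if p_value < p_threshold:
            return True  # significant distribution shift on this feature
    return False
```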
## 🧰 Tech Stack

- ML Frameworks: TensorFlow, PyTorch, Scikit-learn
- Workflow Orchestration: Apache Airflow, Kubeflow
- Versioning: DVC, MLflow, Git
- CI/CD: GitHub Actions, Jenkins
- Monitoring: Prometheus, Grafana, ELK Stack
- Deployment: Docker, Kubernetes, AWS SageMaker
- Explainability: SHAP, LIME
## 📋 Prerequisites

- Python 3.8+
- Docker
- Git
## 🚀 Installation

- Clone the repository:
  ```bash
  git clone https://github.com/Krishnaa548/dsmlops_ownexp.git
  cd dsmlops_ownexp
  ```
- Set up a virtual environment:
  ```bash
  python -m venv venv
  source venv/bin/activate    # Linux/Mac
  venv\Scripts\activate       # Windows
  ```
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Configure environment variables:
  ```bash
  cp .env.example .env
  ```
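The pipeline can then read these variables at startup, for example with python-dotenv. A minimal sketch, assuming hypothetical key names (check `.env.example` for the real ones):

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # loads key=value pairs from .env into the process environment

# Hypothetical keys -- the actual names live in .env.example.
mlflow_uri = os.getenv("MLFLOW_TRACKING_URI", "http://localhost:5000")
data_bucket = os.getenv("DATA_BUCKET")
```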
## ⚙️ Configuration

Modify `configs/pipeline.yaml` to define:
- Data paths (local/cloud storage)
- Model hyperparameters
- Training/evaluation thresholds
- Deployment targets (e.g., Kubernetes cluster)
Example:

```yaml
data:
  raw_path: "data/raw"
  processed_path: "data/processed"

model:
  framework: "tensorflow"
  hyperparameters:
    epochs: 50
    batch_size: 32
```
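A script consuming this file would typically load it with PyYAML; a minimal sketch (the access pattern below is illustrative, not the repo's actual code):

```python
import yaml  # pip install pyyaml

with open("configs/pipeline.yaml") as f:
    config = yaml.safe_load(f)

raw_path = config["data"]["raw_path"]                   # "data/raw"
epochs = config["model"]["hyperparameters"]["epochs"]   # 50
```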
## 🧪 Usage

### Run the Pipeline Locally

```bash
python src/pipeline.py --config configs/pipeline.yaml
```
Or run each stage individually:

1. Data preprocessing:
   ```bash
   python src/run.py --stage preprocess
   ```
2. Model training:
   ```bash
   python src/run.py --stage train
   ```
3. Evaluation & deployment:
   ```bash
   python src/run.py --stage deploy
   ```
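Under the hood, `src/run.py` presumably dispatches on the `--stage` flag. A minimal sketch of that pattern, with hypothetical stage functions standing in for the repo's actual internals:

```python
import argparse

def preprocess(): ...  # hypothetical: clean and split the raw data
def train(): ...       # hypothetical: fit the model and log the run
def deploy(): ...      # hypothetical: evaluate, then push the model to serving

STAGES = {"preprocess": preprocess, "train": train, "deploy": deploy}

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run one pipeline stage")
    parser.add_argument("--stage", choices=STAGES, required=True)
    args = parser.parse_args()
    STAGES[args.stage]()  # e.g. --stage train calls train()
```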
### CI/CD Integration

- GitHub Actions triggers automated testing on `git push`.
- Merging to `main` deploys the model to staging via Kubernetes.
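The tests CI runs on push can be plain pytest cases. A minimal sketch, using a hypothetical preprocessing helper (the function here is a stand-in, not the repo's actual code):

```python
# tests/test_preprocess.py -- hypothetical test module run by CI
import numpy as np

def scale_features(x: np.ndarray) -> np.ndarray:
    """Stand-in for the repo's real preprocessing helper."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def test_scale_features_is_zero_mean_unit_std():
    x = np.random.default_rng(0).normal(5.0, 2.0, size=(100, 3))
    scaled = scale_features(x)
    assert np.allclose(scaled.mean(axis=0), 0.0, atol=1e-8)
    assert np.allclose(scaled.std(axis=0), 1.0, atol=1e-8)
```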
## 📦 Model Deployment

Docker example:

```dockerfile
FROM tensorflow/tensorflow:2.9.1
COPY . /app
WORKDIR /app
CMD ["python", "src/serve.py"]
```
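`src/serve.py` itself is not shown in this README. A minimal Flask sketch of what a prediction endpoint on port 5000 could look like (the route, model path, and payload shape are assumptions):

```python
# Hypothetical src/serve.py -- minimal prediction endpoint on port 5000
from flask import Flask, jsonify, request
import tensorflow as tf

app = Flask(__name__)
model = tf.keras.models.load_model("models/latest")  # assumed model path

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]  # expects {"features": [[...], ...]}
    predictions = model.predict(features).tolist()
    return jsonify({"predictions": predictions})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```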
Build and deploy:

```bash
docker build -t mlops-model:latest .
docker run -p 5000:5000 mlops-model
```

## 📊 Monitoring & Logging

Access the Grafana dashboard at http://localhost:3000 to view model metrics.
Logs are stored in `logs/pipeline.log` and forwarded to Elasticsearch.
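Custom metrics can be exported for Prometheus to scrape using the `prometheus_client` package. A minimal sketch (the metric names and port are illustrative):

```python
from prometheus_client import Counter, Histogram, start_http_server

PREDICTIONS = Counter("model_predictions_total", "Number of predictions served")
LATENCY = Histogram("model_prediction_latency_seconds", "Prediction latency")

@LATENCY.time()          # records how long each call takes
def predict_with_metrics(model, features):
    PREDICTIONS.inc()    # count every prediction request
    return model.predict(features)

start_http_server(8000)  # exposes /metrics on port 8000 for Prometheus to scrape
```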
## 📜 License

Distributed under the MIT License. See `LICENSE` for details.
## 🙏 Acknowledgments

- Inspired by MLflow and Kubeflow.
- Data versioning powered by DVC.