MLOps Project: End-to-End Machine Learning Operations Pipeline

A scalable MLOps framework for automating, deploying, and monitoring machine learning workflows in production.

📖 Overview

This repository provides an end-to-end MLOps pipeline to streamline the machine learning lifecycle—from data ingestion and model training to deployment and monitoring. Designed for data scientists and engineers, it emphasizes reproducibility, scalability, and collaboration by integrating DevOps practices into ML workflows.

Key Features:

  • Automated ML Pipelines: Preprocessing, training, evaluation, and deployment orchestrated via workflow managers (see the DAG sketch after this list).
  • CI/CD Integration: Automated testing and deployment using GitHub Actions/Jenkins.
  • Model & Data Versioning: Track experiments, datasets, and models with DVC/MLflow.
  • Production Monitoring: Real-time performance metrics and drift detection.
  • Explainability: SHAP/LIME integration for model interpretability.
  • Cloud-Ready: Deploy on Kubernetes, AWS, or GCP with Docker containers.
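
As a concrete illustration of how those stages fit together, here is a minimal sketch of the pipeline as a linear Airflow DAG. It is illustrative only: the DAG id, schedule, and stage bodies are assumptions, not code from this repository.

```python
# Illustrative Airflow DAG wiring the four pipeline stages in order.
# Assumptions: Airflow 2.x; dag_id, schedule, and stage bodies are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def preprocess():
    print("ingest and clean raw data")

def train():
    print("fit the model")

def evaluate():
    print("score the model against the configured thresholds")

def deploy():
    print("push the approved model to serving")

with DAG(
    dag_id="mlops_pipeline",           # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,            # trigger manually while developing
    catchup=False,
) as dag:
    tasks = [
        PythonOperator(task_id=fn.__name__, python_callable=fn)
        for fn in (preprocess, train, evaluate, deploy)
    ]
    # Chain the stages: preprocess >> train >> evaluate >> deploy
    for upstream, downstream in zip(tasks, tasks[1:]):
        upstream >> downstream
```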

🛠️ Tech Stack

  • ML Frameworks: TensorFlow, PyTorch, Scikit-learn
  • Workflow Orchestration: Apache Airflow, Kubeflow
  • Versioning: DVC, MLflow, Git
  • CI/CD: GitHub Actions, Jenkins
  • Monitoring: Prometheus, Grafana, ELK Stack
  • Deployment: Docker, Kubernetes, AWS SageMaker
  • Explainability: SHAP, LIME

🚀 Getting Started

Prerequisites

  • Python 3.8+
  • Docker
  • Git

Installation

  1. Clone the repository:

```bash
git clone https://github.com/Krishnaa548/dsmlops_ownexp.git
cd dsmlops_ownexp
```
  2. Set up a virtual environment:

```bash
python -m venv venv
source venv/bin/activate   # Linux/macOS
venv\Scripts\activate      # Windows
```

  3. Install dependencies:

```bash
pip install -r requirements.txt
```

  4. Configure environment variables:

```bash
cp .env.example .env
```

Update .env with your AWS/GCP credentials, API keys, etc.
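
Loading those variables at runtime is then a one-liner; the sketch below assumes python-dotenv is among the dependencies, and the variable name is illustrative rather than taken from .env.example.

```python
# Sketch: read secrets from .env into the environment at startup.
# Assumptions: python-dotenv is installed; the variable name is hypothetical.
import os

from dotenv import load_dotenv

load_dotenv()  # parses key=value pairs from .env

aws_key = os.getenv("AWS_ACCESS_KEY_ID")
if aws_key is None:
    raise RuntimeError("AWS_ACCESS_KEY_ID is not set; check your .env file")
```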

⚙️ Configuration

Modify configs/pipeline.yaml to define:

  • Data paths (local/cloud storage)
  • Model hyperparameters
  • Training/evaluation thresholds
  • Deployment targets (e.g., Kubernetes cluster)

Example:

```yaml
data:
  raw_path: "data/raw"
  processed_path: "data/processed"

model:
  framework: "tensorflow"
  hyperparameters:
    epochs: 50
    batch_size: 32
```
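
Before launching a run, the config can be loaded and spot-checked with PyYAML; this short sketch assumes PyYAML is installed and mirrors the keys in the example above.

```python
# Sketch: load configs/pipeline.yaml and spot-check a few keys.
# Assumption: PyYAML is in requirements.txt.
import yaml

with open("configs/pipeline.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["data"]["raw_path"])                    # -> "data/raw"
print(cfg["model"]["hyperparameters"]["epochs"])  # -> 50
```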

🧪 Usage

Run the pipeline locally:

```bash
python src/pipeline.py --config configs/pipeline.yaml
```

Or run the stages one at a time:

  1. Data processing:

```bash
python src/run.py --stage preprocess
```

  2. Model training:

```bash
python src/run.py --stage train
```

  3. Evaluation & deployment:

```bash
python src/run.py --stage deploy
```
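
For orientation, a stage runner like src/run.py usually reduces to an argparse dispatch over the stage functions; the sketch below is a hypothetical reconstruction, not this repository's actual code.

```python
# Hypothetical sketch of a stage dispatcher in the spirit of src/run.py.
import argparse

def preprocess():
    print("preprocessing data")

def train():
    print("training model")

def deploy():
    print("evaluating and deploying model")

STAGES = {"preprocess": preprocess, "train": train, "deploy": deploy}

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run a single pipeline stage")
    parser.add_argument("--stage", choices=STAGES, required=True)
    args = parser.parse_args()
    STAGES[args.stage]()  # e.g. --stage train runs train()
```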

CI/CD Integration

GitHub Actions triggers automated testing on git push.

Merging to main deploys the model to staging via Kubernetes.
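
As a sketch of the kind of check that test stage might run, here is a hypothetical pytest test; the file it inspects and the assertions are assumptions, not taken from this repository's workflow.

```python
# Hypothetical CI smoke test (pytest). Assumptions: PyYAML is installed and
# configs/pipeline.yaml follows the example shown earlier in this README.
import yaml

def test_pipeline_config_has_required_sections():
    with open("configs/pipeline.yaml") as f:
        cfg = yaml.safe_load(f)
    assert "data" in cfg and "model" in cfg
    assert cfg["model"]["hyperparameters"]["epochs"] > 0
```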

📦 Model Deployment

Docker Example (Dockerfile):

```dockerfile
FROM tensorflow/tensorflow:2.9.1
COPY . /app
WORKDIR /app
CMD ["python", "src/serve.py"]
```

Build and deploy:
```bash
docker build -t mlops-model:latest .
docker run -p 5000:5000 mlops-model
```
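
The image's entrypoint, src/serve.py, is not shown in this README; the following is a minimal sketch of what such a server could look like, assuming Flask and a placeholder model path that are illustrative only.

```python
# Minimal illustrative model server in the spirit of src/serve.py.
# Assumptions: Flask is installed; the model path and request schema are
# placeholders, not taken from this repository.
import tensorflow as tf
from flask import Flask, jsonify, request

app = Flask(__name__)
model = tf.keras.models.load_model("models/latest")  # hypothetical path

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]  # e.g. a flat list of floats
    prediction = model.predict([features]).tolist()
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)  # matches docker run -p 5000:5000
```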

📊 Monitoring & Logging

Access the Grafana dashboard at http://localhost:3000 to view model metrics.

Logs are stored in logs/pipeline.log and forwarded to Elasticsearch.
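
For reference, the sketch below shows one way to wire up both the file log and a Prometheus metric in Python; the metric name is illustrative and it assumes prometheus_client is installed.

```python
# Sketch: log to logs/pipeline.log and expose a Prometheus counter.
# Assumptions: prometheus_client is installed; the metric name is hypothetical.
import logging

from prometheus_client import Counter, start_http_server

logging.basicConfig(
    filename="logs/pipeline.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

PREDICTIONS = Counter("model_predictions_total", "Predictions served")

start_http_server(8000)  # Prometheus can scrape :8000/metrics
logging.info("metrics endpoint up")
PREDICTIONS.inc()        # increment per prediction in real serving code
```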

📜 License

Distributed under the MIT License. See LICENSE for details.

🙏 Acknowledgments

  • Inspired by MLflow and Kubeflow.
  • Data versioning powered by DVC.
