GitHub - bilalltf/kerdro-mlflow

MLOps project using Kedro and MLflow

Kedro is an open-source Python framework for creating reproducible, maintainable, and modular data science code. A Kedro project is structured as nodes and pipelines: a node is a function that performs an operation on the data, and a pipeline is a set of nodes executed in sequence. The most common pipelines are data engineering and data science pipelines.
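The node and pipeline idea can be illustrated without installing anything. The sketch below is plain Python and only mimics the concept; it is not Kedro's API (in Kedro itself, nodes and pipelines are built with `node` and `Pipeline` from `kedro.pipeline`). The `Node` tuple, `run_pipeline`, the two node functions, and the sample records are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, NamedTuple


class Node(NamedTuple):
    """A function plus the names of the datasets it reads and writes."""
    func: Callable
    inputs: List[str]
    output: str


def run_pipeline(nodes: List[Node], catalog: Dict[str, object]) -> Dict[str, object]:
    """Run nodes in order, passing data between them by dataset name."""
    data = dict(catalog)  # copy so the caller's catalog is not mutated
    for n in nodes:
        args = [data[name] for name in n.inputs]
        data[n.output] = n.func(*args)
    return data


# Two hypothetical nodes: one cleans raw trips, one summarises them.
def drop_zero_distance(trips):
    return [t for t in trips if t["trip_distance"] > 0]


def mean_fare(trips):
    return sum(t["fare_amount"] for t in trips) / len(trips)


pipeline = [
    Node(drop_zero_distance, ["raw_trips"], "clean_trips"),
    Node(mean_fare, ["clean_trips"], "mean_fare"),
]

# Made-up sample records, not taken from the real dataset.
raw = [
    {"trip_distance": 1.2, "fare_amount": 8.0},
    {"trip_distance": 0.0, "fare_amount": 2.5},
    {"trip_distance": 3.4, "fare_amount": 14.0},
]
result = run_pipeline(pipeline, {"raw_trips": raw})
print(result["mean_fare"])  # 11.0
```

Each node names its inputs and output, so the output of the cleaning step becomes the input of the summary step. That naming is what lets a pipeline be assembled from small, independently testable functions.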

MLflow is an open-source platform for managing the end-to-end machine learning lifecycle. It provides a central place to track experiments, compare results, and share models.

The dataset is YELLOW TRIPDATA 2018-12 (yellow taxi trip records for December 2018) from the nyc.gov website.

After downloading the dataset, save it in the data folder:

├── data
│   ├── 01_raw
│   │   └── yellow_tripdata_2018-12.csv
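Before running anything, it can help to confirm that the file really is where the project expects it. The following is a small sketch using only the standard library; `read_header` and `RAW_FILE` are hypothetical names, and the only assumption taken from this README is the relative path shown above.

```python
import csv
from pathlib import Path
from typing import List

# Relative location of the raw file, as shown in the folder tree above.
RAW_FILE = Path("data") / "01_raw" / "yellow_tripdata_2018-12.csv"


def read_header(project_root: Path, relative_path: Path = RAW_FILE) -> List[str]:
    """Return the CSV column names, or raise if the file is not in place."""
    csv_path = project_root / relative_path
    if not csv_path.is_file():
        raise FileNotFoundError(f"Expected the dataset at {csv_path}")
    with csv_path.open(newline="") as handle:
        # Read only the first row; an empty file yields an empty list.
        return next(csv.reader(handle), [])
```

Calling `read_header(Path.cwd())` from the project root should print the column names if the download was saved correctly, and raise `FileNotFoundError` otherwise.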

Set up the Kedro environment:

conda create --name kedro_env python=3.7
conda activate kedro_env
pip install kedro

Install kedro-viz:

pip install kedro-viz

Install kedro-mlflow:

pip install kedro-mlflow

Install requirements:

pip install -r src/requirements.txt

Everything is now installed and ready to go. Start by creating a new Kedro project and running it:

Create a new project:

kedro new

Run the project:

kedro run

Run the project with a specific pipeline:

kedro run --pipeline data_engineering

Run the project with a specific node:

kedro run --node create_report
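If these runs need to be launched from a script (for example in a scheduled job), the command line can be assembled programmatically. This is a minimal sketch, assuming the `kedro` executable is on the PATH and that the `--pipeline` and `--node` flags behave as in the commands above; `kedro_run_command` and `run_kedro` are hypothetical helper names, not part of Kedro.

```python
import subprocess
from typing import List, Optional


def kedro_run_command(pipeline: Optional[str] = None,
                      node: Optional[str] = None) -> List[str]:
    """Build the argument list for `kedro run` with optional filters."""
    cmd = ["kedro", "run"]
    if pipeline:
        cmd += ["--pipeline", pipeline]
    if node:
        cmd += ["--node", node]
    return cmd


def run_kedro(pipeline: Optional[str] = None,
              node: Optional[str] = None) -> int:
    """Run the command in the current directory and return its exit code."""
    completed = subprocess.run(kedro_run_command(pipeline, node), check=False)
    return completed.returncode
```

For example, `kedro_run_command(pipeline="data_engineering")` yields `["kedro", "run", "--pipeline", "data_engineering"]`. Passing the arguments as a list avoids shell quoting issues.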

To visualize the project pipelines, run kedro-viz:

kedro viz

To track the project metrics with MLflow, initialize kedro-mlflow and run the project:

kedro mlflow init
kedro mlflow run

Run kedro-mlflow with a specific pipeline:

kedro mlflow run --pipeline data_engineering

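Data engineering pipeline: (figure: kedro-de-pipeline.png)

Data science pipeline: (figure: kedro-ds-pipeline.png)

The tracking UI: (figure: mlflow-metrics-tracking.png)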
