Jupyter Notebook Step by Step Guide

Jupyter notebook walkthrough

You can find a Jupyter Notebook tutorial at tutorials/sklearn/catwalk_sklearn_tutorial

This guide is intended for data scientists with a basic understanding of developing models in Jupyter Notebook, and for data scientists learning to prepare machine learning models for deployment.

You will train and save a simple scikit-learn model, then wrap it with Catwalk.
For more details on the model, refer to the scikit-learn linear regression example.
You will create a model and save it with MLflow.

Open Jupyter Notebook and load the tutorial notebook.
Run the cells one by one. The first cell installs the dependencies if they are not already installed. If they are, you will see the message Requirement already satisfied. Otherwise, wait for the packages to be downloaded and installed, which should take only a few minutes.
You will then load an example dataset, which should produce the following output:

Number of training examples: 422
Number of testing examples: 20

Then you will train a very simple linear regression model using the default parameters, and should see the output:

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
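
The loading and training steps above can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code. It assumes the example dataset is the diabetes dataset bundled with scikit-learn, with the last 20 samples held out for testing, which matches the counts shown above. It also assumes a single feature is used, as in the scikit-learn linear regression example. The printed model representation depends on the installed scikit-learn version.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Assumption: the tutorial uses scikit-learn's bundled diabetes dataset
# (442 samples, 10 features).
X, y = load_diabetes(return_X_y=True)

# Assumption: keep a single feature (column 2), as in the scikit-learn example.
X = X[:, [2]]

# Hold out the last 20 samples for testing; the rest are used for training.
X_train, X_test = X[:-20], X[-20:]
y_train, y_test = y[:-20], y[-20:]

print("Number of training examples:", len(X_train))
print("Number of testing examples:", len(X_test))

# Fit an ordinary least squares model with the default parameters.
model = LinearRegression()
model.fit(X_train, y_train)
print(model)
```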

The model evaluation step will show you the Coefficients, Mean squared error, and Coefficient of determination.
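
To make the two evaluation metrics concrete, the sketch below computes them by hand in plain Python. The helper names and the input values are made up for illustration and are not taken from the tutorial. In the notebook itself, these numbers would more likely come from scikit-learn's mean_squared_error and r2_score functions in sklearn.metrics.

```python
def mean_squared_error_by_hand(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared_by_hand(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only; they are not the tutorial's results.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]

mse = mean_squared_error_by_hand(y_true, y_pred)
r2 = r_squared_by_hand(y_true, y_pred)
print(f"Mean squared error: {mse:.3f}")            # 0.125
print(f"Coefficient of determination: {r2:.3f}")  # 0.975
```

A coefficient of determination of 1 means the predictions match the targets exactly, while a value near 0 means the model does no better than always predicting the mean.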
You will then plot the result, which should look like this:
You will save the model properties and structure in pickle format.
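
The pickle step could look something like the sketch below, which uses only the standard library. A plain dictionary of fitted parameters stands in for the trained model here, and its keys and values are hypothetical. A fitted scikit-learn estimator can be passed to pickle.dump in the same way. The file name model.pkl follows the tutorial's output listing, and the temporary directory is only there so that the sketch does not overwrite real files.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for the trained model: a dictionary of fitted parameters.
model_properties = {"coef": [2.0], "intercept": 1.0}

# Write to a temporary directory so the sketch does not touch real files.
model_path = Path(tempfile.mkdtemp()) / "model.pkl"

# Serialize the object to disk; pickle files must be opened in binary mode.
with open(model_path, "wb") as f:
    pickle.dump(model_properties, f)

# Load it back and confirm the restored object matches the original.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

print(restored == model_properties)  # True
```

Only unpickle files that come from a source you trust, because loading a pickle can execute arbitrary code.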
The model itself will be saved as a .py file; in the tutorial it is saved as model.py.
Then you will save the model metadata in YAML format, which contains the model's name, version, contact information, and validation information.
In this tutorial, requirements.txt contains only the package sklearn; you can add any other packages that you used during development.
You can then test the model and the server to check whether the implementation behaves as expected. Both tests should return OK.
In a separate terminal, navigate to catwalk/tutorials/sklearn and start the Catwalk server with the command catwalk serve --debug. When the server is ready, you will see the message * Running on http://0.0.0.0:9090/.
Then you can run the last two cells in the notebook. The first requests and returns the model metadata, and the second sends a value to the model for prediction. The outputs are returned in JSON format.
Stop the server by pressing Ctrl-C in the terminal.
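
For reference, the two requests made by the last notebook cells, which must be sent while the server is still running, might be built along the lines of the sketch below. The port comes from the server start-up message. Everything else is an assumption for illustration: the localhost host name, the /info and /predict paths, and the {"input": ...} payload shape are hypothetical, so check the tutorial notebook for the actual endpoints and request format.

```python
import json
import urllib.request

# Port taken from the server start-up message; the host is assumed to be local.
BASE_URL = "http://localhost:9090"


def build_info_request():
    """GET request for model metadata (the /info path is an assumption)."""
    return urllib.request.Request(f"{BASE_URL}/info", method="GET")


def build_predict_request(value):
    """POST request with one input value (path and payload are assumptions)."""
    body = json.dumps({"input": value}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


info_request = build_info_request()
predict_request = build_predict_request(0.05)

# With the server running, uncomment to send the requests:
# print(send(info_request))
# print(send(predict_request))
```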

After running the tutorial notebook, you will see the following items in your sklearn folder:

catwalk_sklearn_tutorial.ipynb - the tutorial notebook
docker-compose.yml - the Docker Compose file containing the Docker configuration
Dockerfile - the file containing all the commands and information needed to assemble an image
model.pkl - the pickle file you generated, which contains the model artifact
model.py - the Python model
model.yml - the model metadata
requirements.txt - the Python packages required to run the model