Jupyter Notebook Step by Step Guide
mirandachong edited this page May 13, 2020 · 1 revision
You can find a Jupyter Notebook tutorial at tutorials/sklearn/catwalk_sklearn_tutorial
- Data scientists with a basic understanding of developing models in Jupyter Notebook.
- Data scientists learning to prepare machine learning models for deployment.
- You will train and save a simple scikit-learn model then wrap it with Catwalk.
- For more details on the model, please refer to the scikit-learn linear regression example.
- You will create a model and save it with MLflow.
- Open Jupyter notebook and load the tutorial notebook.
- Run the cells one by one. In the first cell you will install the dependencies if you do not have them installed. If you already have the dependencies installed, you will see `Requirement already satisfied`. Otherwise, wait for the packages to be downloaded and installed. It should only take a few minutes.
- You will then load an example dataset which should return the following output:
Number of training examples: 422
Number of testing examples: 20
- Then you will train a very simple linear regression model using the default parameters, and should see the output:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
- The model evaluation step will show you the `Coefficients`, `Mean squared error`, and `Coefficient of determination`.
- You will then plot the result.
- You will save the model properties and structure in pickle format.
- The model itself will be saved in `.py` format; in the tutorial it will be saved as `model.py`.
- Then you will save the model metadata in YAML format, which contains the model's name, version, contact information, and validation information.
- In this tutorial, the `requirements.txt` will just contain the package `sklearn`; you can include other packages that you have used in your development.
- You can then test the model and server to check whether the implementation behaves as expected. The output should return `OK` for both tests.
- In a separate terminal, navigate to `catwalk/tutorials/sklearn`, and then start the Catwalk server with `$ catwalk serve --debug`. When the server is ready, you will see the message `* Running on http://0.0.0.0:9090/`.
- Then you can run the last two cells in the notebook, which request and return the model metadata, and send a value to the model for prediction. The outputs will be returned in JSON format.
- Stop the server in the terminal by pressing `CTRL-C`.
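The train, evaluate, and pickle steps in the list above can be sketched without any third-party packages. This is a minimal illustration, not the tutorial's code: it fits a one-feature line by ordinary least squares as a stand-in for scikit-learn's `LinearRegression`, uses made-up data instead of the tutorial's dataset, and stores the fitted parameters in a plain dictionary. The helper names `fit_line`, `predict`, `mse`, and `r_squared` are hypothetical.

```python
import pickle


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, xs):
    """Apply the fitted line to new inputs."""
    return [model["slope"] * x + model["intercept"] for x in xs]


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up data lying exactly on y = 2x + 1 (NOT the tutorial's dataset).
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]

# Hold out the last 3 points for testing, in the same spirit as the
# train/test split reported by the notebook.
x_train, x_test = xs[:-3], xs[-3:]
y_train, y_test = ys[:-3], ys[-3:]

model = fit_line(x_train, y_train)
y_pred = predict(model, x_test)
test_mse = mse(y_test, y_pred)
test_r2 = r_squared(y_test, y_pred)

# Round-trip the fitted parameters through pickle, as a stand-in for
# writing and later reloading a file such as model.pkl.
restored = pickle.loads(pickle.dumps(model))
```

With scikit-learn installed, the same three quantities would typically come from `LinearRegression().fit(...)`, `mean_squared_error`, and `r2_score`; the notebook remains the reference for the exact calls it uses.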
After running the tutorial notebook, you will see the following items in your sklearn folder:
- catwalk_sklearn_tutorial.ipynb - The tutorial notebook
- docker-compose.yml - the Docker Compose file containing the Docker configuration
- Dockerfile - the file which contains all commands and information needed to assemble an image
- model.pkl - the pickle file you generated, which contains the model artifact
- model.py - the Python model
- model.yml - the model metadata
- requirements.txt - the Python packages required for running the model
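As a quick sanity check after running the notebook, a short script can confirm that every file in the list above is present. This is only a sketch: the helper name `missing_files` is hypothetical, and the folder path is whatever location your checkout uses.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = [
    "catwalk_sklearn_tutorial.ipynb",
    "docker-compose.yml",
    "Dockerfile",
    "model.pkl",
    "model.py",
    "model.yml",
    "requirements.txt",
]


def missing_files(folder):
    """Return the expected file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]
```

For example, `missing_files("catwalk/tutorials/sklearn")` returns an empty list when all seven files exist in that folder, and otherwise lists the ones still to be generated.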
Copyright 2020 Leap Beyond Emerging Technologies B.V. (CC BY 4.0 )