Building your own model

Models should be developed independently in secure development environments. Model servers run in the catwalk base image, a Python 3.7 environment, so it is recommended to develop the model with that same Python version.

Interface

Models have two requirements:

A specification in model.yml
An implementation in model.py

Models may also contain an optional requirements.txt file to manage pip dependencies.

Note: catwalk currently supports only dependencies that can be installed via pip.

Specification

Models need a specification inside a model.yml file, containing:

name: "Model name (str)"
version: "Model version (str)"

contact:
  name: "Contact name (str)"
  email: "Contact email (str)"

schema:
  input: "The input schema of the model in OpenAPI format (object / array)"
  output: "The output schema of the model in OpenAPI format (object / array)"

This specification file is used to validate the incoming data posted to the server, as well as to form the docker tag from the model name and version.
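
As an illustration, a filled-in specification might look like the following, written here as the Python dictionary that parsing such a model.yml would yield. The model name, contact details and schema fields are all invented for this sketch, and the missing_keys helper is hypothetical rather than part of catwalk. It only checks that the keys listed above are present.

```python
# Hypothetical contents of a model.yml, shown as the equivalent Python dict.
# Every value below is invented for illustration.
SPEC = {
    "name": "example-model",
    "version": "0.1.0",  # a string, not a number
    "contact": {
        "name": "Jane Doe",
        "email": "jane.doe@example.com",
    },
    "schema": {
        # OpenAPI-style schema objects describing one input / output record.
        "input": {
            "type": "object",
            "properties": {"x": {"type": "number"}},
            "required": ["x"],
        },
        "output": {
            "type": "object",
            "properties": {"y": {"type": "number"}},
        },
    },
}

# Dotted paths of the keys named in the specification template.
REQUIRED_KEYS = (
    "name",
    "version",
    "contact.name",
    "contact.email",
    "schema.input",
    "schema.output",
)


def missing_keys(spec, required=REQUIRED_KEYS):
    """Return the dotted paths from `required` that are absent in `spec`."""
    missing = []
    for dotted in required:
        node = spec
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing
```

A check like this can catch an incomplete specification before the model is handed to catwalk. It does not validate the schemas themselves.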

Implementation

Models must be implemented in a model.py file, in a single class called Model. This is the interface:

class Model(object):
    """The Model knows how to load itself, provides test data and runs with `Model::predict`.
    """

    def __init__(self, path="."):
        """The Model constructor.

        Use this to initialise your model, including loading any weights etc.

        :param str path: The full path to the folder in which the model is located.
        """
        pass

    def load_test_data(self, path=".") -> (list, list):
        """Loads and returns test data.

        Format of the returned data is similar to pd.DataFrame.records, a list of key-value pairs.

        :param str path: The full path to the folder in which the model is located.
        :return: Tuple of feature, target lists.
        """
        pass

    def predict(self, X) -> dict:
        """Uses the model to predict a value.

        :param dict X: The features to predict against
        :return: The prediction result
        """
        pass
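
A minimal sketch of a model.py that fills in this interface is shown below. It is a toy linear model, not one of the example models from the repo. The field names x and y, the hard-coded coefficients, and the choice to return targets as records are assumptions made for illustration.

```python
class Model(object):
    """Toy model that predicts y = slope * x + intercept."""

    def __init__(self, path="."):
        # A real model would load its weights from files under `path`;
        # here the "weights" are hard-coded constants (an assumption).
        self.path = path
        self.slope = 2.0
        self.intercept = 1.0

    def load_test_data(self, path="."):
        # Records-style data: one dict of key-value pairs per example.
        features = [{"x": 0.0}, {"x": 1.5}]
        targets = [{"y": 1.0}, {"y": 4.0}]
        return features, targets

    def predict(self, X):
        # X is a single record; the result is returned as a dict.
        return {"y": self.slope * X["x"] + self.intercept}


if __name__ == "__main__":
    # Smoke test so that `python model.py` exercises the model locally.
    model = Model()
    X_test, y_test = model.load_test_data()
    for features, target in zip(X_test, y_test):
        assert model.predict(features) == target
    print("ok")
```

The block at the bottom lets the file be run directly with python model.py as a quick local check.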

Examples

Example models are included in this repo for reference and convenience. Simply run them with your local python.

$ cd example_models/rng
$ python model.py

Support for pandas DataFrames

The pandas DataFrame is the go-to tool for many a pythonic Data Scientist. To add support for DataFrames in the Model.predict() method, specify io_type: PANDAS_DATA_FRAME in the model.yml. This will ensure that the X argument passed in is a pre-constructed DataFrame. Note that you must return a DataFrame from the Model.predict() method as well!

Important points:

The model's IO schema can either be in "records" format ([{column -> value}, … , {column -> value}]) or simplified to a single record ({column -> value}).
Similarly, the server will accept input data in either of these formats.
pandas must be installed via the model's requirements.txt, to avoid binary or API incompatibilities between versions.

See example_models/dataframe for an example.
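
The two accepted input shapes can be reduced to one before a DataFrame is built. The sketch below is a hypothetical helper, not catwalk's own code (the source does not show how catwalk handles this internally). It wraps a single record in a list so that both shapes end up in "records" format, which is the list-of-dicts form that pandas.DataFrame accepts.

```python
def to_records(payload):
    """Normalise a single record or a list of records to a list of records.

    Hypothetical helper for illustration only.
    """
    if isinstance(payload, dict):
        # Single record: {column -> value}
        return [payload]
    if isinstance(payload, list) and all(isinstance(r, dict) for r in payload):
        # Already in "records" format: [{column -> value}, ...]
        return payload
    raise TypeError("expected a record (dict) or a list of records")


single = {"height": 1.8, "weight": 75}
many = [{"height": 1.8, "weight": 75}, {"height": 1.6, "weight": 60}]

records_from_single = to_records(single)
records_from_many = to_records(many)
```

The column names height and weight are invented for the example.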

Model tests

In a CI/CD pipeline, models are unit tested before they can be safely wrapped and deployed. Note that a model may have its own set of requirements, which catwalk will install when running these tests.

$ catwalk test-model --model-path /path/to/your/model
