Home

What is catwalk?

Simply put, catwalk is a model wrapping and serving platform (hence the name) for your Python data science models. It provides a simple, automated way to wrap a generic Python-based model in a production-ready, dockerised REST API server, and to test it.

The catwalk package is made for:

Data scientists looking for an efficient and effective way to productionise their models,
Data engineers who want to build, maintain and test production data pipelines, and
Infrastructure engineers exploring ways to deploy data science models to production.

What does catwalk do?

Makes it quick and easy for data scientists to get their models to production
Ensures robustness with thorough testing (of the model, server, docker image, and input/output data)
Handles productionisation of models automatically, using standardised best practices
Versions models and builds them into deployment-ready, secure and scalable docker images

This is done through the catwalk command line tool:

catwalk test-model tests the model against test data and I/O schema
catwalk serve wraps the model and creates a REST API that validates model input and output
catwalk test-server tests the model server
catwalk build-prep creates standard build files (Dockerfile, nginx configuration, ...)
catwalk build builds a secure and scalable docker image
catwalk test-image tests the docker image
catwalk deploy-prep creates standard deployment files (docker-compose.yml, ...)

Using the above commands you can swiftly wrap models via a CI/CD pipeline for cloud deployment.
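To make the input and output validation mentioned for catwalk serve more concrete, here is a minimal Python sketch of the general idea. It is not catwalk's implementation: the schema format (a mapping from field name to Python type), the validate and validated helpers, and the toy model are all hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Dict

# Assumed schema shape (hypothetical, not catwalk's format):
# field name -> expected Python type.
Schema = Dict[str, type]
Record = Dict[str, Any]

INPUT_SCHEMA: Schema = {"sepal_length": float, "sepal_width": float}
OUTPUT_SCHEMA: Schema = {"species": str}


def validate(record: Record, schema: Schema) -> None:
    """Raise ValueError unless the record has exactly the schema's fields and types."""
    missing = schema.keys() - record.keys()
    extra = record.keys() - schema.keys()
    if missing or extra:
        raise ValueError(
            f"missing fields: {sorted(missing)}, unexpected fields: {sorted(extra)}"
        )
    for name, expected in schema.items():
        if not isinstance(record[name], expected):
            raise ValueError(
                f"field {name!r} should be {expected.__name__}, "
                f"got {type(record[name]).__name__}"
            )


def validated(
    predict: Callable[[Record], Record],
    input_schema: Schema,
    output_schema: Schema,
) -> Callable[[Record], Record]:
    """Wrap a predict function so both its input and its output are checked."""

    def wrapper(record: Record) -> Record:
        validate(record, input_schema)   # reject bad requests before the model runs
        result = predict(record)
        validate(result, output_schema)  # catch a misbehaving model before replying
        return result

    return wrapper


def toy_predict(record: Record) -> Record:
    """Stand-in for a trained model (hypothetical)."""
    label = "setosa" if record["sepal_length"] < 5.5 else "versicolor"
    return {"species": label}


safe_predict = validated(toy_predict, INPUT_SCHEMA, OUTPUT_SCHEMA)
```

Calling safe_predict with a well-formed record returns the model's answer, while a record with a missing field or a wrongly typed value raises ValueError before the model is ever invoked. Checking the output as well as the input means a model that starts returning malformed results fails loudly at the API boundary instead of passing bad data downstream.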

Where does catwalk fit into the Data Science Process?

A data scientist can build their model however they wish, using any (Python) tools they like, then wrap the result in catwalk. A CI pipeline can then automate the test-build-test-package-test sequence, and an engineer or CD pipeline receives a production-ready artifact to launch into production.
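The test-build-test-package-test sequence could be driven from a small Python script like the one below. This is a sketch under assumptions: the subcommand names come from the list of catwalk commands on this page, but running them with no further arguments, the choice to leave out the long-running catwalk serve, and the run_stage and run_pipeline helpers are all hypothetical. Consult the catwalk documentation for the options each command actually needs.

```python
import subprocess
from typing import Callable, List, Sequence

# Subcommand names as listed on this page. `serve` is left out on the
# assumption that it runs until stopped, which would block a CI job.
STAGES: List[str] = [
    "test-model",
    "test-server",
    "build-prep",
    "build",
    "test-image",
    "deploy-prep",
]


def run_stage(stage: str) -> int:
    """Run `catwalk <stage>` with no extra arguments (an assumption) and return its exit code."""
    return subprocess.run(["catwalk", stage]).returncode


def run_pipeline(
    stages: Sequence[str],
    runner: Callable[[str], int] = run_stage,
) -> List[str]:
    """Run the stages in order and stop at the first one that fails.

    Returns the list of completed stages; raises RuntimeError on a
    non-zero exit code so that the CI job is marked as failed.
    """
    completed: List[str] = []
    for stage in stages:
        if runner(stage) != 0:
            raise RuntimeError(
                f"catwalk {stage} failed; completed stages: {completed}"
            )
        completed.append(stage)
    return completed
```

In a CI job, calling run_pipeline(STAGES) would execute each stage in turn. The runner parameter exists so the ordering and fail-fast logic can be exercised without catwalk or Docker installed, for example by passing a function that simply returns 0.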

catwalk helps guide decisions on productionisation
catwalk helps document and package models once they are trained
catwalk reduces the steps from "I have a trained model" to "I have a model ready for production" to two small files and a simple CLI
catwalk runs before your production environment, and is agnostic to the details of that environment (although by default it assumes REST communication between containers)
catwalk is agnostic to the precise CI/CD tools used (by default it assumes dockerisation)
catwalk is agnostic to the model and training regime/environment (except that it currently assumes Python)

How does catwalk compare to similar packages?

catwalk is heavily influenced by several industry-leading open source projects (Amazon SageMaker, Red Hat OpenShift S2I, Databricks MLflow and Google Kubeflow).

Feature	MLflow	Kubeflow	catwalk
Python support	✔️	✔️	✔️
Other languages support	✔️	❌	❌
Command line tool	✔️	✔️	✔️
Model training	✔️	✔️	❌
Model testing	❌	❌	✔️
Model serving	✔️	✔️	✔️
Model I/O schema validation	❌	❌	✔️
SSL support	❌	✔️	✔️
Stateless API	❌	❌	✔️
Docker build	✔️	✔️	✔️
Model deployment	❌	✔️	❌

Learn more

Want to learn more about catwalk? Here, you can find some step-by-step guides:

And here you can find further explanations about different parts of catwalk:

Licensing of catwalk

Software: http://opensource.org/licenses/Apache-2.0 (c) 2019 Leap Beyond Emerging Technologies B.V. (unless otherwise stated)
Documentation: http://creativecommons.org/licenses/by/4.0/ where not covered by above (c) 2019 Leap Beyond Emerging Technologies B.V. (unless otherwise stated)

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
