sklearn-porter

Transpile trained scikit-learn estimators to C, Java, JavaScript and other programming languages.
It is intended for resource-constrained embedded systems and critical applications where performance matters most.

Machine learning algorithms

Algorithm                                 Programming language
                                          C   Java  JavaScript  Go  PHP  Ruby
Classification
sklearn.svm.SVC                           ✓   ✓     ✓               ✓
sklearn.svm.NuSVC                         ✓   ✓     ✓               ✓
sklearn.svm.LinearSVC                     ✓   ✓     ✓           ✓   ✓    ✓
sklearn.tree.DecisionTreeClassifier       ✓   ✓     ✓               ✓
sklearn.ensemble.RandomForestClassifier   ✓   ✓     ✓
sklearn.ensemble.ExtraTreesClassifier     ✓   ✓     ✓
sklearn.ensemble.AdaBoostClassifier       ✓   ✓     ✓
sklearn.neighbors.KNeighborsClassifier        ✓     ✓
sklearn.neural_network.MLPClassifier          ○     ○
sklearn.naive_bayes.GaussianNB                ✓     ✓
sklearn.naive_bayes.BernoulliNB               ✓
Regression
sklearn.neural_network.MLPRegressor                 ✓

✓ = is full-featured, ○ = has minor exceptions

Installation

pip install sklearn-porter

If you want the latest bleeding-edge changes, you can install the module from the master (development) branch:

pip uninstall -y sklearn-porter
pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master

Minimum requirements

- python>=2.7.3
- scikit-learn>=0.14.1

If you want to transpile a multilayer perceptron (sklearn.neural_network.MLPClassifier), you have to upgrade the scikit-learn package:

- scikit-learn>=0.18.0
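
A minimal sketch of how these minimums could be checked in a script before porting. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of sklearn-porter, and the parser only handles plain numeric versions such as `0.18.0` (no `dev` or `rc` suffixes):

```python
def parse_version(version):
    """Turn a dotted version string such as '0.18.0' into a tuple of three ints."""
    parts = [int(part) for part in version.split('.')[:3]]
    parts += [0] * (3 - len(parts))  # pad '0.18' to (0, 18, 0)
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


# Minimum versions taken from the requirement lists above.
MIN_SKLEARN = '0.14.1'
MIN_SKLEARN_MLP = '0.18.0'

print(meets_minimum('0.17.1', MIN_SKLEARN))      # True
print(meets_minimum('0.17.1', MIN_SKLEARN_MLP))  # False
```

In practice the installed version string would come from `sklearn.__version__`.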

Usage

Export

The following example shows how you can port a decision tree model to Java:

from sklearn.datasets import load_iris
from sklearn import tree
from sklearn_porter import Porter

# Load data and train the classifier:
samples = load_iris()
X, y = samples.data, samples.target
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)

# Export:
porter = Porter(clf, language='java')
output = porter.export()
print(output)

The exported result matches the official human-readable version of the decision tree.
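
To illustrate why a ported tree reads like the tree itself, here is a toy sketch that renders a small decision tree as nested Java if/else statements. It is not the code that sklearn-porter generates; the dictionary layout, the function `tree_to_java` and the threshold values are assumptions made up for this illustration:

```python
def tree_to_java(node, indent=1):
    """Recursively render a toy decision tree as nested Java if/else code."""
    pad = '    ' * indent
    if 'label' in node:
        # Leaf node: return the predicted class index.
        return '{}return {};\n'.format(pad, node['label'])
    # Inner node: compare one feature against the split threshold.
    return (
        '{p}if (features[{f}] <= {t}) {{\n{left}{p}}} else {{\n{right}{p}}}\n'
    ).format(
        p=pad, f=node['feature'], t=node['threshold'],
        left=tree_to_java(node['left'], indent + 1),
        right=tree_to_java(node['right'], indent + 1),
    )


# Hypothetical two-level tree; the thresholds are illustrative only.
toy_tree = {
    'feature': 2, 'threshold': 2.45,
    'left': {'label': 0},
    'right': {
        'feature': 3, 'threshold': 1.75,
        'left': {'label': 1},
        'right': {'label': 2},
    },
}

java_source = ('static int predict(double[] features) {\n'
               + tree_to_java(toy_tree) + '}\n')
print(java_source)
```

Each split becomes one `if` statement and each leaf becomes one `return`, so the generated source can be compared against a plot of the tree branch by branch.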

Prediction

Run the prediction(s) in the target programming language directly:

# ...
porter = Porter(clf, language='java')

# Prediction(s):
Y_preds = porter.predict(X)
y_pred = porter.predict(X[0])
y_pred = porter.predict([1., 2., 3., 4.])

Accuracy

Always check how closely the predictions of the ported estimator match those of the original estimator:

# ...
porter = Porter(clf, language='java')

# Accuracy:
accuracy = porter.predict_test(X)
print(accuracy) # 1.0
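
The score can be read as the fraction of samples on which both estimators agree. The following sketch shows that idea in plain Python; it is an assumption about what the number means, not the implementation of `predict_test`, and the function name `agreement` is hypothetical:

```python
def agreement(original_preds, ported_preds):
    """Fraction of samples on which both estimators predict the same class."""
    if len(original_preds) != len(ported_preds):
        raise ValueError('Both prediction lists must have the same length.')
    if len(original_preds) == 0:
        raise ValueError('At least one prediction is required.')
    matches = sum(1 for a, b in zip(original_preds, ported_preds) if a == b)
    return matches / float(len(original_preds))


# Hypothetical predictions, e.g. from clf.predict(X) and porter.predict(X).
print(agreement([0, 1, 2, 2], [0, 1, 2, 2]))  # 1.0
print(agreement([0, 1, 2, 2], [0, 1, 1, 2]))  # 0.75
```

A value below 1.0 means that at least one sample is classified differently by the ported code and should be investigated before deployment.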

Command-line interface

This example shows how to port a model from the command line. First, save the trained estimator as a pickle file:

# ...
import joblib  # bundled as sklearn.externals.joblib in older scikit-learn releases

# Save the estimator:
joblib.dump(clf, 'model.pkl')

The model can then be transpiled with the following command:

python -m sklearn_porter --input <pickle_file> [--output <destination_dir>] [--language {c,go,java,js,php,ruby}]
python -m sklearn_porter -i <pickle_file> [-o <destination_dir>] [-l {c,go,java,js,php,ruby}]

The following commands all produce the same result:

python -m sklearn_porter --input model.pkl --language java
python -m sklearn_porter -i model.pkl -l java

By changing the language parameter you can set the target programming language:

python -m sklearn_porter -i model.pkl -l c
python -m sklearn_porter -i model.pkl -l go
python -m sklearn_porter -i model.pkl -l java
python -m sklearn_porter -i model.pkl -l js
python -m sklearn_porter -i model.pkl -l php
python -m sklearn_porter -i model.pkl -l ruby
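
When several targets are needed, the calls can be assembled in a script. This sketch only builds the argument lists and uses the long options shown above; the helper `build_command` and its default language are assumptions for illustration and not part of the package:

```python
import sys

# Language codes accepted by the --language option, as listed above.
LANGUAGES = ('c', 'go', 'java', 'js', 'php', 'ruby')


def build_command(pickle_file, language='java', output_dir=None):
    """Assemble the argument list for one command-line call."""
    if language not in LANGUAGES:
        raise ValueError('Unsupported language: {}'.format(language))
    command = [sys.executable, '-m', 'sklearn_porter',
               '--input', pickle_file, '--language', language]
    if output_dir is not None:
        command += ['--output', output_dir]
    return command


# Print one command per target language (interpreter path omitted).
for lang in LANGUAGES:
    print(' '.join(build_command('model.pkl', language=lang)[1:]))
```

Each returned list can be passed to `subprocess.check_call` to run the transpilation.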

Use the --help option to show further information:

python -m sklearn_porter --help
python -m sklearn_porter -h

Development

Environment

Install the required environment modules by executing the script environment.sh:

./recipes/environment.sh

conda config --add channels conda-forge
conda env create -n sklearn-porter python=2 -f environment.yml
source activate sklearn-porter

Furthermore, Node.js (>=6), Java (>=1.6), PHP (>=7), Ruby (>=1.9.3) and GCC (>=4.2) are required to run all tests.

Testing

The tests cover module functions as well as matching predictions of transpiled models. Run all tests by executing the script test.sh:

./recipes/test.sh

source activate sklearn-porter
python -m unittest discover -vp '*Test.py'
source deactivate

While you are developing new features or fixes, you can reduce the test duration by setting the number of random model tests:

N_RANDOM_FEATURE_SETS=15 N_EXISTING_FEATURE_SETS=30 python -m unittest discover -vp '*Test.py'
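
A minimal sketch of how a test module might read such variables. The helper `read_count` and the fallback values are assumptions; the project's real defaults are not stated here:

```python
import os


def read_count(name, default):
    """Read a positive integer setting from the environment."""
    raw = os.environ.get(name)
    if raw is None:
        return default  # variable not set: fall back to the default
    value = int(raw)
    if value < 1:
        raise ValueError('{} must be a positive integer'.format(name))
    return value


# The fallback values below are placeholders, not the project's real defaults.
n_random = read_count('N_RANDOM_FEATURE_SETS', 30)
n_existing = read_count('N_EXISTING_FEATURE_SETS', 60)
print(n_random, n_existing)
```

Smaller values shorten a test run at the cost of covering fewer random inputs.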

Quality

Keeping the code quality high is strongly recommended. I use Pylint for that, which you can run by executing the script lint.sh:

./recipes/lint.sh

source activate sklearn-porter
find ./sklearn_porter -name '*.py' -exec pylint {} \;
source deactivate

Questions?

Don't be shy and feel free to contact me on Twitter or Gitter.

License

The module is Open Source Software released under the MIT license.
