Transpile trained scikit-learn models to C, Java, JavaScript and others.
It's intended for resource-constrained embedded systems and critical applications where performance matters most.
Algorithm | Programming language | |||||
Classification | C | Java | JavaScript | Go | PHP | Ruby |
sklearn.svm.SVC | ✓ | ✓ | ✓ | ✓ | ||
sklearn.svm.NuSVC | ✓ | ✓ | ✓ | ✓ | ||
sklearn.svm.LinearSVC | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
sklearn.tree.DecisionTreeClassifier | ✓ | ✓ | ✓ | ✓ | ||
sklearn.ensemble.RandomForestClassifier | ✓ | ✓ | ✓ | |||
sklearn.ensemble.ExtraTreesClassifier | ✓ | ✓ | ✓ | |||
sklearn.ensemble.AdaBoostClassifier | ✓ | ✓ | ✓ | |||
sklearn.neighbors.KNeighborsClassifier | ✓ | ✓ | ||||
sklearn.neural_network.MLPClassifier | ○ | ○ | ||||
sklearn.naive_bayes.GaussianNB | ✓ | ✓ | ||||
sklearn.naive_bayes.BernoulliNB | ✓ | |||||
Regression | ||||||
sklearn.neural_network.MLPRegressor | ✓ |
✓ = is full-featured, ○ = has minor exceptions
pip install sklearn-porter
If you want the latest bleeding-edge changes, you can install the module from the master (development) branch:
pip uninstall -y sklearn-porter
pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master
- python>=2.7.3
- scikit-learn>=0.14.1
If you want to transpile a multilayer perceptron (sklearn.neural_network.MLPClassifier), you have to upgrade the scikit-learn package:
- scikit-learn>=0.18.0
The following example shows how you can port a decision tree model to Java:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn_porter import Porter
# Load data and train the classifier:
samples = load_iris()
X, y = samples.data, samples.target
clf = DecisionTreeClassifier()
clf.fit(X, y)
# Export:
porter = Porter(clf, language='java')
output = porter.export()
print(output)
The exported result matches the official human-readable version of the decision tree.
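To give a feel for what a transpiled tree boils down to, here is a minimal sketch in Python. It assumes the array layout scikit-learn uses for fitted trees (child indices, split feature and threshold per node, with a sample going left when its feature value is less than or equal to the threshold). The arrays are a hand-written toy tree, not output from sklearn-porter, and the names `predict_one`, `LEFT`, `RIGHT`, `FEATURE`, `THRESHOLD` and `CLASS` are made up for illustration.

```python
# Toy tree over iris-like features (hypothetical values, written by hand).
# Node 0: feature 2 <= 2.5 -> leaf 1 (class 0), else node 2
# Node 2: feature 3 <= 1.7 -> leaf 3 (class 1), else leaf 4 (class 2)
LEFT = [1, -1, 3, -1, -1]        # index of the left child, -1 marks a leaf
RIGHT = [2, -1, 4, -1, -1]       # index of the right child, -1 marks a leaf
FEATURE = [2, -2, 3, -2, -2]     # feature tested at each node (-2 for leaves)
THRESHOLD = [2.5, -2.0, 1.7, -2.0, -2.0]
CLASS = [None, 0, None, 1, 2]    # predicted class at each leaf


def predict_one(features, node=0):
    """Walk from the given node down to a leaf and return that leaf's class."""
    while LEFT[node] != -1:
        if features[FEATURE[node]] <= THRESHOLD[node]:
            node = LEFT[node]
        else:
            node = RIGHT[node]
    return CLASS[node]


print(predict_one([5.1, 3.5, 1.4, 0.2]))  # 0
print(predict_one([6.0, 2.9, 4.5, 1.5]))  # 1
print(predict_one([6.3, 3.3, 6.0, 2.5]))  # 2
```

A port to C or Java can express the same logic either as a loop over such arrays or as nested if/else statements, which is why tree models translate well to targets without a machine learning runtime.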
Run the prediction(s) in the target programming language directly:
# ...
porter = Porter(clf, language='java')
# Prediction(s):
Y_preds = porter.predict(X)
y_pred = porter.predict(X[0])
y_pred = porter.predict([1., 2., 3., 4.])
Always compute the accuracy between the original and the ported estimator:
# ...
porter = Porter(clf, language='java')
# Accuracy:
accuracy = porter.predict_test(X)
print(accuracy) # 1.0
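The idea behind this check can be sketched in plain Python. The score is assumed here to be the fraction of samples on which both estimators predict the same label. The helper name `agreement` and the label lists are hypothetical and only illustrate the idea; they are not part of the sklearn-porter API.

```python
def agreement(original_preds, ported_preds):
    """Return the fraction of positions where both prediction lists agree."""
    if len(original_preds) != len(ported_preds):
        raise ValueError("prediction lists must have the same length")
    if not original_preds:
        raise ValueError("prediction lists must not be empty")
    matches = sum(1 for a, b in zip(original_preds, ported_preds) if a == b)
    return matches / float(len(original_preds))


# Hypothetical labels from the scikit-learn model and from the ported code.
print(agreement([0, 1, 2, 1], [0, 1, 2, 1]))  # 1.0
print(agreement([0, 1, 2, 1], [0, 1, 2, 2]))  # 0.75
```

A score below 1.0 means the ported estimator disagrees with the original on at least one sample, which is worth investigating before the generated code is deployed.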
This example shows how you can port a model from the command line. First, save the model in the pickle format:
# ...
import joblib  # older scikit-learn releases: from sklearn.externals import joblib
# Save the trained estimator:
joblib.dump(clf, 'model.pkl')
After that the model can be transpiled by using the following command:
python -m sklearn_porter --input <pickle_file> [--output <destination_dir>] [--language {c,go,java,js,php,ruby}]
python -m sklearn_porter -i <pickle_file> [-o <destination_dir>] [-l {c,go,java,js,php,ruby}]
The following commands all produce the same result:
python -m sklearn_porter --input model.pkl --language java
python -m sklearn_porter -i model.pkl -l java
By changing the language parameter you can set the target programming language:
python -m sklearn_porter -i model.pkl -l c
python -m sklearn_porter -i model.pkl -l go
python -m sklearn_porter -i model.pkl -l java
python -m sklearn_porter -i model.pkl -l js
python -m sklearn_porter -i model.pkl -l php
python -m sklearn_porter -i model.pkl -l ruby
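If you want to produce all targets in one go, the calls above can be assembled from Python. The following is a minimal sketch that only builds and prints the argument lists, using the `-i`, `-l` and `-o` flags shown above; the helper name `build_command` is hypothetical and not part of sklearn-porter.

```python
import sys

# Language identifiers accepted by the --language / -l option.
LANGUAGES = ["c", "go", "java", "js", "php", "ruby"]


def build_command(pickle_file, language, output_dir=None):
    """Return the argument list for one sklearn_porter command-line call."""
    if language not in LANGUAGES:
        raise ValueError("unsupported language: %s" % language)
    cmd = [sys.executable, "-m", "sklearn_porter",
           "-i", pickle_file, "-l", language]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    return cmd


for lang in LANGUAGES:
    # Only print the command here; pass the list to subprocess.call to run it.
    print(" ".join(build_command("model.pkl", lang)))
```

Keeping the arguments as a list avoids shell quoting problems when file names contain spaces.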
Further information is available via the --help parameter:
python -m sklearn_porter --help
python -m sklearn_porter -h
Install the required environment modules by executing the script environment.sh:
./recipes/environment.sh
conda config --add channels conda-forge
conda env create -n sklearn-porter python=2 -f environment.yml
source activate sklearn-porter
Furthermore, Node.js (>=6), Java (>=1.6), PHP (>=7), Ruby (>=1.9.3) and GCC (>=4.2) are required for all tests.
The tests cover module functions as well as matching predictions of transpiled models. Run all tests by executing the script test.sh:
./recipes/test.sh
source activate sklearn-porter
python -m unittest discover -vp '*Test.py'
source deactivate
While you are developing new features or fixes, you can reduce the test duration by setting the number of random model tests:
N_RANDOM_FEATURE_SETS=15 N_EXISTING_FEATURE_SETS=30 python -m unittest discover -vp '*Test.py'
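As a rough illustration of how such settings can be picked up inside a test module, here is a minimal sketch. The variable names come from the command above, while the helper `read_int_env` and the fallback values are assumptions made for this example and do not reflect the project's actual defaults.

```python
import os


def read_int_env(name, default):
    """Read a non-negative integer from the environment, else use `default`."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    value = int(raw)  # raises ValueError for non-numeric input
    if value < 0:
        raise ValueError("%s must be >= 0, got %d" % (name, value))
    return value


# The fallback values below are placeholders, not the project's defaults.
n_random = read_int_env("N_RANDOM_FEATURE_SETS", 30)
n_existing = read_int_env("N_EXISTING_FEATURE_SETS", 60)
print(n_random, n_existing)
```

Smaller values shorten a test run at the cost of covering fewer feature sets, so the full run is still worth doing before a release.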
Checking the code quality is highly recommended. I use Pylint for that, which you can run by executing the script lint.sh:
./recipes/lint.sh
source activate sklearn-porter
find ./sklearn_porter -name '*.py' -exec pylint {} \;
source deactivate
Don't be shy and feel free to contact me on Twitter or Gitter.
The module is Open Source Software released under the MIT license.