How to Deploy a Shifu Model to Production
Shifu is not only a high-performance machine learning model training framework but also an execution framework that supports several model formats. The two major formats are PMML, a standard machine learning model description, and Shifu's native binary model format. Both formats are supported for the LR, NN, RF, and GBDT algorithms.
After a model has been trained in the train step, use the export command to generate PMML models in the local pmmls folder.
shifu export -t pmml
A sample NN PMML model can be found in NN PMML. Such a model can be executed for prediction with jPMML; sample code can be found in NN Example Code. In production, you can therefore deploy the PMML model with a PMML execution engine.
A sample tree ensemble model (RF/GBDT) can be found in Tree PMML Model. Such models can also be executed by jPMML.
GBDT and RF models also have a native binary model format in Shifu. After training, these models can be found in your local models folder with the postfix rf or gbt. The model is compressed to keep it small for applications such as GBDT with 3000 trees: in our test, a model of 3000 trees with depth 6 is about 4MB. A sample model can be found here. The binary execution engine code is IndependentTreeModel. The only dependencies needed to deploy this prediction engine are shifu-.jar and guagua-.jar, which makes it easy to integrate into your production code. Check here for jar dependency.
Compared with the PMML-format tree ensemble model, our execution engine performs much better against latency SLAs. For 2000 depth-6 trees, the average execution time is 3ms, while PMML takes over 50ms.
Please check code for such binary model scoring.
Because Encog is used for NN training, the default model format is the Encog format, like example. However, an Encog NN model does not contain the feature transform part. To support a self-contained model, a binary model format is defined; check one sample here. This binary model can be executed by IndependentNNModel. The only dependencies are shifu-.jar, guagua-.jar, encog-core-3.0.0.jar and the slf4j logging jars.
The binary format exists to produce a smaller model, and its average latency is much better than that of the PMML NN model: our engine scores an NN model in 2ms on average, compared with 10ms in the jPMML engine.
Please check code for such binary model scoring.
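The latency figures quoted above for the tree and NN engines come from the authors' own tests, so it can be worth measuring in your own environment before committing to an SLA. The following is a minimal timing sketch, not Shifu code: `ScoringLatencySketch` and `averageMillis` are hypothetical names, and the scorer is passed in as a plain function so that any engine (a binary model, a PMML evaluator) can be wrapped and compared the same way.

```java
import java.util.function.ToDoubleFunction;

// Hypothetical helper for illustration; it is not part of Shifu or jPMML.
final class ScoringLatencySketch {

    /**
     * Returns the average wall-clock time per scoring call, in milliseconds.
     * Warm-up calls run first so that JIT compilation does not skew the result.
     */
    static <T> double averageMillis(ToDoubleFunction<T> scorer, T record, int warmup, int runs) {
        double sink = 0.0; // accumulate scores so the calls cannot be optimized away
        for (int i = 0; i < warmup; i++) {
            sink += scorer.applyAsDouble(record);
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += scorer.applyAsDouble(record);
        }
        long elapsedNanos = System.nanoTime() - start;
        if (Double.isNaN(sink)) {
            System.out.println("warning: scorer produced NaN");
        }
        return elapsedNanos / 1_000_000.0 / runs;
    }
}
```

To compare two engines, wrap each one's scoring call in a lambda, pass the same input record to both, and compare the two averages.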
Previously, if 5 bagging models were generated, they were independent model specs, and in production all 5 models had to be deployed and executed independently. This causes two issues:
- Not easy to deploy
- Poor prediction efficiency, since every model has to run its own feature transform before feeding data into NN prediction.
To solve such issues, Shifu supports a unified NN/RF/GBDT Bagging model for both PMML and binary format by these two commands:
shifu export -t bagging
shifu export -t baggingpmml
The first command exports a binary bagging model to the local onebaggingmodel/model.bnn (or model.brf / model.bgbt), which can be executed by the same IndependentNNModel or IndependentTreeModel. The second command exports one PMML bagging model to the pmmls folder.
The unified native bagging model has proved to perform well in production, both because of Shifu's efficient prediction engine and because the feature transform is executed only once no matter how many models are bagged. The PMML unified bagging model, although it is a single model spec, so far still does not execute the feature transform only once.
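The transform-once idea can be illustrated with a small, self-contained sketch. It is not Shifu's implementation: `UnifiedBaggingSketch`, its toy `transform`, and the five linear "models" are all hypothetical stand-ins, chosen only to show how many times the feature transform runs under each deployment style.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: these names are hypothetical and are not Shifu classes.
final class UnifiedBaggingSketch {

    /** Counts how many times the stand-in feature transform has run. */
    static int transformCalls = 0;

    // Stand-in for the feature transform: raw fields to a numeric vector.
    static double[] transform(Map<String, Double> raw) {
        transformCalls++;
        return new double[] { raw.getOrDefault("x1", 0.0), raw.getOrDefault("x2", 0.0) };
    }

    // Independent specs: every model runs its own transform before scoring.
    static double scoreIndependently(List<Function<double[], Double>> models,
                                     Map<String, Double> raw) {
        double sum = 0.0;
        for (Function<double[], Double> model : models) {
            sum += model.apply(transform(raw));
        }
        return sum / models.size();
    }

    // Unified spec: transform once, then reuse the vector for every model.
    static double scoreUnified(List<Function<double[], Double>> models,
                               Map<String, Double> raw) {
        double[] features = transform(raw);
        double sum = 0.0;
        for (Function<double[], Double> model : models) {
            sum += model.apply(features);
        }
        return sum / models.size();
    }

    // Five toy linear models standing in for five bagged models.
    static List<Function<double[], Double>> fiveToyModels() {
        List<Function<double[], Double>> models = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            final double weight = i;
            models.add(f -> weight * f[0] + f[1]);
        }
        return models;
    }
}
```

Both methods return the same averaged score, but the independent path runs the transform once per model while the unified path runs it once per record. That difference grows with the number of bagged models and with the cost of the transform.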