Home
MLeap deploys Spark ML (and some MLlib) transformers and pipelines to production without a Spark Context.
MLeap also extends scikit-learn so that scikit-learn transformers, pipelines, and feature unions can be serialized and deployed without any runtime dependency on the scikit-learn stack (NumPy, SciPy, C++ libraries). It serializes these transformers and pipelines as their Spark equivalents, so you can load and deploy your scikit-learn pipelines on Spark infrastructure with a few lines of code.
- Serializing a Spark ML Pipeline and Serving with MLeap
- Setting up a Spark 2.0 notebook with MLeap and Toree
- Setting up PySpark 2.0 notebook with MLeap and Toree
- Setting up Scikit-Learn with MLeap
- ML Pipelines with AirBnb data - Scala
- ML Pipelines with AirBnb data - PySpark
- ML Pipelines with Lending Club data - Scala

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| Binarizer | x | x | x | |
| BucketedRandomProjectionLSH | x | x | | |
| Bucketizer | x | x | | |
| ChiSqSelector | x | x | | |
| CountVectorizer | x | x | | |
| DCT | x | x | | |
| ElementwiseProduct | x | x | x | |
| HashingTermFrequency | x | x | x | |
| IDF | x | x | | |
| Imputer | x | x | x | |
| Interaction | x | x | x | |
| MaxAbsScaler | x | x | | |
| MinHashLSH | x | x | | |
| MinMaxScaler | x | x | x | |
| Ngram | x | x | | |
| Normalizer | x | x | | |
| OneHotEncoder | x | x | | |
| PCA | x | x | x | |
| QuantileDiscretizer | x | x | | |
| PolynomialExpansion | x | x | x | |
| ReverseStringIndexer | x | x | x | |
| StandardScaler | x | x | x | |
| StopWordsRemover | x | x | | |
| StringIndexer | x | x | x | |
| Tokenizer | x | x | x | |
| VectorAssembler | x | x | x | |
| VectorIndexer | x | x | | |
| VectorSlicer | x | x | | |
| WordToVector | x | x | | |

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| DecisionTreeClassifier | x | x | x | |
| GradientBoostedTreeClassifier | x | x | | |
| LogisticRegression | x | x | x | |
| LogisticRegressionCv | x | x | x | |
| NaiveBayesClassifier | x | x | | |
| OneVsRest | x | x | | |
| RandomForestClassifier | x | x | x | |
| SupportVectorMachines | x | x | x | |
| MultiLayerPerceptron | x | x | | |

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| AFTSurvivalRegression | x | x | | |
| DecisionTreeRegression | x | x | x | |
| GeneralizedLinearRegression | x | x | | |
| GradientBoostedTreeRegression | x | x | | |
| IsotonicRegression | x | x | | |
| LinearRegression | x | x | x | |
| RandomForestRegression | x | x | x | |

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| BisectingKMeans | x | x | | |
| GaussianMixtureModel | x | x | | |
| KMeans | x | x | | |
| LDA | x | | | |

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow | Description |
|---|---|---|---|---|---|
| MathUnary | x | x | x | | Simple set of unary mathematical operations |
| MathBinary | x | x | x | | Simple set of binary mathematical operations |

| Transformer | Spark | MLeap | Scikit-Learn | TensorFlow |
|---|---|---|---|---|
| ALS | x | | | |
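
One way to use the support tables above is to check, before exporting, that every stage in a pipeline is marked as supported by the target runtime. The following is a minimal sketch and not part of MLeap: the `SUPPORT` dictionary is a hand-copied subset of the tables on this page, and `unsupported_stages` is a hypothetical helper name used only for illustration.

```python
# Hand-copied subset of the support tables on this page (an assumption of this
# sketch, not an MLeap API). Each entry maps a transformer name to the set of
# runtimes marked with "x" in its row; extend it as needed.
SUPPORT = {
    "Binarizer": {"Spark", "MLeap", "Scikit-Learn"},
    "StringIndexer": {"Spark", "MLeap", "Scikit-Learn"},
    "VectorAssembler": {"Spark", "MLeap", "Scikit-Learn"},
    "OneHotEncoder": {"Spark", "MLeap"},
    "RandomForestClassifier": {"Spark", "MLeap", "Scikit-Learn"},
    "KMeans": {"Spark", "MLeap"},
    "LDA": {"Spark"},
    "ALS": {"Spark"},
}


def unsupported_stages(stage_names, runtime="MLeap"):
    """Return the stage names the table does not mark as supported on `runtime`.

    Names missing from SUPPORT are treated as unsupported, so an incomplete
    table fails loudly rather than silently passing a stage through.
    """
    return [name for name in stage_names if runtime not in SUPPORT.get(name, set())]


if __name__ == "__main__":
    pipeline = ["StringIndexer", "OneHotEncoder", "VectorAssembler", "LDA"]
    # Stages without an "x" in the MLeap column.
    print(unsupported_stages(pipeline))
    # Stages without an "x" in the Scikit-Learn column.
    print(unsupported_stages(pipeline, "Scikit-Learn"))
```

An empty result means every listed stage has an "x" in the chosen column. Any name returned is either unsupported on that runtime or simply absent from the copied subset.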