Home
Hollin Wilkins edited this page Dec 29, 2016 · 55 revisions
MLeap lets you deploy Spark ML (and some MLlib) transformers and pipelines to production without a SparkContext.
MLeap also extends scikit-learn so that you can serialize and deploy scikit-learn transformers, pipelines, and feature unions without any runtime dependency on scikit-learn or its stack (NumPy, SciPy, C++ libraries). It serializes these transformers and pipelines as their Spark equivalents, so you can load and deploy your scikit-learn pipelines on Spark infrastructure with a few lines of code.
- Serializing a Spark ML Pipeline and Serving with MLeap
- Setting up a Spark 2.0 notebook with MLeap and Toree
- Setting up PySpark 2.0 notebook with MLeap and Toree
- Setting up Scikit-Learn with MLeap
- ML Pipelines with AirBnb data - Scala
- ML Pipelines with AirBnb data - PySpark
- ML Pipelines with Lending Club data - Scala

Transformer | Spark | Scikit-Learn | TensorFlow |
---|---|---|---|
Binarizer | x | x | |
BucketedRandomProjectionLSH | x | | |
Bucketizer | x | | |
ChiSqSelector | x | | |
CountVectorizer | x | | |
DCT | x | | |
ElementwiseProduct | x | x | |
HashingTermFrequency | x | x | |
IDF | x | | |
Imputer | x | x | |
Interaction | x | x | |
MaxAbsScaler | x | | |
MinHashLSH | x | | |
MinMaxScaler | x | x | |
Ngram | x | | |
Normalizer | x | | |
OneHotEncoder | x | x | |
PCA | x | x | |
QuantileDiscretizer | x | | |
PolynomialExpansion | x | x | |
ReverseStringIndexer | x | x | |
StandardScaler | x | x | |
StopWordsRemover | x | | |
StringIndexer | x | x | |
Tokenizer | x | x | |
VectorAssembler | x | x | |
VectorIndexer | | | |
VectorSlicer | x | | |
WordToVector | x | | |
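
One way to use the feature-transformer table above is to check, before exporting, whether every stage of a pipeline has a mark in the column for your framework. The following is a minimal Python sketch of that lookup, assuming a hand-transcribed subset of the rows. The `FEATURE_SUPPORT` dictionary and the `unsupported_stages` helper are hypothetical names used for illustration and are not part of the MLeap API.

```python
# A few rows transcribed by hand from the feature table above.
# NOTE: this dict and the helper below are illustrative only; MLeap does not
# ship them, and the table on this page remains the source of truth.
FEATURE_SUPPORT = {
    "Binarizer": {"spark", "scikit-learn"},
    "Bucketizer": {"spark"},
    "CountVectorizer": {"spark"},
    "OneHotEncoder": {"spark", "scikit-learn"},
    "StandardScaler": {"spark", "scikit-learn"},
    "StringIndexer": {"spark", "scikit-learn"},
    "VectorAssembler": {"spark", "scikit-learn"},
    "VectorIndexer": set(),  # no column is marked for this row
}


def unsupported_stages(stages, framework):
    """Return the stage names that are not marked for `framework`.

    Names missing from FEATURE_SUPPORT are treated as unsupported.
    """
    return [
        name
        for name in stages
        if framework not in FEATURE_SUPPORT.get(name, set())
    ]


if __name__ == "__main__":
    stages = ["StringIndexer", "OneHotEncoder", "Bucketizer", "VectorAssembler"]
    print(unsupported_stages(stages, "spark"))         # []
    print(unsupported_stages(stages, "scikit-learn"))  # ['Bucketizer']
```

An empty result means every stage in the list is marked for that framework in the transcribed rows. A non-empty result names the stages to replace or drop before serializing.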

Transformer | Spark | Scikit-Learn | TensorFlow |
---|---|---|---|
DecisionTreeClassifier | x | x | |
GradientBoostedTreeClassifier | x | | |
LogisticRegression | x | x | |
LogisticRegressionCv | x | x | |
NaiveBayesClassifier | x | | |
OneVsRest | x | | |
RandomForestClassifier | x | x | |
SupportVectorMachines | x | x | |
MultiLayerPerceptron | x | | |

Transformer | Spark | Scikit-Learn | TensorFlow |
---|---|---|---|
AFTSurvivalRegression | x | | |
DecisionTreeRegression | x | x | |
GeneralizedLinearRegression | x | | |
GradientBoostedTreeRegression | x | | |
IsotonicRegression | x | | |
LinearRegression | x | x | |
RandomForestRegression | x | x | |

Transformer | Spark | Scikit-Learn | TensorFlow |
---|---|---|---|
BisectingKMeans | x | | |
GaussianMixtureModel | x | | |
KMeans | x | | |
LDA | | | |

Transformer | Spark | Scikit-Learn | TensorFlow | Description |
---|---|---|---|---|
MathUnary | x | x | | Simple set of unary mathematical operations |
MathBinary | x | x | | Simple set of binary mathematical operations |

Transformer | Spark | Scikit-Learn | TensorFlow |
---|---|---|---|
ALS | | | |
- CholeskyDecomposition