SmileMiner

SmileMiner (Statistical Machine Intelligence and Learning Engine) is a set of pure Java libraries implementing a range of state-of-the-art machine learning algorithms. SmileMiner is self-contained and requires only the Java standard library. The major components are:

  • Smile: The core machine learning library.
  • SmileMath: Mathematical functions (basic, special, kernel, distance, RBF, etc.), sorting, random number generators, optimization, linear algebra, statistical distributions, and hypothesis testing.
  • SmileData: Parsers for ARFF, libsvm, delimited text, sparse matrix, and microarray gene expression data.
  • SmileGraph: Graph algorithms on adjacency list and adjacency matrix representations.
  • SmileInterpolation: One- and two-dimensional interpolation.
  • SmilePlot: Swing-based data visualization library.

SmileMiner is well documented and you can browse the javadoc for more information. A basic tutorial is available on the project wiki.

To see SmileMiner in action, please download the demo jar file and then run java -jar smile-demo.jar.

You can use the libraries through the Maven Central repository by adding the following to your project's pom.xml file:

    <dependency>
      <groupId>com.github.haifengl</groupId>
      <artifactId>smile-core</artifactId>
      <version>1.0.1</version>
    </dependency>

For the other modules, replace the artifactId smile-core with smile-math, smile-data, smile-graph, smile-interpolation, or smile-plot.

SmileMiner implements the following major machine learning algorithms:

  • Classification: Support Vector Machines, Decision Trees, AdaBoost, Gradient Boosting, Random Forest, Logistic Regression, Neural Networks, RBF Networks, Maximum Entropy Classifier, KNN, Naïve Bayesian, Fisher/Linear/Quadratic/Regularized Discriminant Analysis.

  • Regression: Support Vector Regression, Gaussian Process, Regression Trees, Gradient Boosting, Random Forest, RBF Networks, OLS, LASSO, Ridge Regression.

  • Feature Selection: Genetic Algorithm based Feature Selection, Ensemble Learning based Feature Selection, Signal-to-Noise Ratio, Sum of Squares Ratio.

  • Clustering: BIRCH, CLARANS, DBScan, DENCLUE, Deterministic Annealing, K-Means, X-Means, G-Means, Neural Gas, Growing Neural Gas, Hierarchical Clustering, Sequential Information Bottleneck, Self-Organizing Maps, Spectral Clustering, Minimum Entropy Clustering.

  • Association Rule & Frequent Itemset Mining: FP-growth mining algorithm.

  • Manifold Learning: IsoMap, LLE, Laplacian Eigenmap, PCA, Kernel PCA, Probabilistic PCA, GHA, Random Projection.

  • Multi-Dimensional Scaling: Classical MDS, Isotonic MDS, Sammon Mapping.

  • Nearest Neighbor Search: BK-Tree, Cover Tree, KD-Tree, LSH.

  • Sequence Learning: Hidden Markov Model.
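
To make one of the listed techniques concrete, here is a minimal plain-Java sketch of k-nearest-neighbor classification (the KNN entry under Classification). It uses only the Java standard library and is not SmileMiner's implementation or API. The class name KnnSketch and its methods are hypothetical, chosen for illustration, and the sketch assumes dense numeric features with Euclidean distance.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Illustrative k-nearest-neighbor classifier. Hypothetical class, NOT SmileMiner's API. */
final class KnnSketch {
    private final double[][] points;
    private final int[] labels;
    private final int k;

    KnnSketch(double[][] points, int[] labels, int k) {
        if (points.length != labels.length) {
            throw new IllegalArgumentException("points and labels must have the same length");
        }
        if (k < 1 || k > points.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        this.points = points;
        this.labels = labels;
        this.k = k;
    }

    /** Squared Euclidean distance; the square root is not needed for ranking. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the majority label among the k training points closest to the query. */
    int predict(double[] query) {
        // Sort training indices by distance to the query (brute force, O(n log n)).
        Integer[] order = new Integer[points.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(i -> squaredDistance(points[i], query)));

        // Count votes among the k nearest; on a tie, the label that reached
        // the winning count first (i.e. via nearer neighbors) is kept.
        Map<Integer, Integer> votes = new HashMap<>();
        int best = labels[order[0]];
        int bestVotes = 0;
        for (int j = 0; j < k; j++) {
            int label = labels[order[j]];
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestVotes) {
                bestVotes = count;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] x = {{0, 0}, {0, 1}, {1, 0}, {5, 5}, {5, 6}, {6, 5}};
        int[] y = {0, 0, 0, 1, 1, 1};
        KnnSketch knn = new KnnSketch(x, y, 3);
        System.out.println(knn.predict(new double[] {0.2, 0.1}));
        System.out.println(knn.predict(new double[] {5.5, 5.2}));
    }
}
```

The brute-force scan is fine for small data sets. The structures listed under Nearest Neighbor Search (KD-Tree, Cover Tree, and so on) exist to avoid comparing the query against every training point.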

SmilePlot

SmileMiner also includes SmilePlot, a Swing-based data visualization library that provides scatter plots, line plots, staircase plots, bar plots, box plots, histograms, 3D histograms, dendrograms, heatmaps, hexmaps, QQ plots, contour plots, surfaces, and wireframes. The class PlotCanvas provides built-in functions such as zoom in/out, export, print, and customization.

SmilePlot requires the SwingX library for JXTable. If your environment cannot use SwingX, you can remove this dependency by replacing JXTable with the standard Swing JTable.
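
As a rough illustration of that swap, the sketch below builds a read-only numeric table with the standard javax.swing.JTable alone. JXTable is a subclass of JTable, so code that only needs basic table behavior can generally use JTable directly, though any SwingX-only features would be lost. The class PlainTableSketch and its methods are hypothetical and do not come from SmilePlot; which SmilePlot classes actually reference JXTable is not stated here, so treat this as a starting point rather than a patch.

```java
import javax.swing.JTable;
import javax.swing.table.DefaultTableModel;

/** Hypothetical helper showing a SwingX-free table. NOT part of SmilePlot. */
final class PlainTableSketch {

    /** Wraps a numeric matrix in a read-only table model with Double columns. */
    static DefaultTableModel buildModel(String[] columns, double[][] rows) {
        DefaultTableModel model = new DefaultTableModel(columns, 0) {
            @Override
            public Class<?> getColumnClass(int columnIndex) {
                return Double.class; // lets the table sort and render values as numbers
            }

            @Override
            public boolean isCellEditable(int row, int column) {
                return false; // display only
            }
        };
        for (double[] row : rows) {
            Object[] boxed = new Object[row.length];
            for (int i = 0; i < row.length; i++) {
                boxed[i] = row[i]; // autoboxed to Double
            }
            model.addRow(boxed);
        }
        return model;
    }

    /** Creates a plain JTable with click-to-sort column headers enabled. */
    static JTable buildTable(DefaultTableModel model) {
        JTable table = new JTable(model);
        table.setAutoCreateRowSorter(true); // standard Swing sorting, available since Java 6
        return table;
    }

    public static void main(String[] args) {
        DefaultTableModel model = buildModel(
            new String[] {"x", "y"},
            new double[][] {{1.0, 2.0}, {3.0, 4.0}});
        JTable table = buildTable(model);
        System.out.println(table.getRowCount() + " rows, " + table.getColumnCount() + " columns");
    }
}
```

To display the table, place it in a JScrollPane inside a JFrame as with any other Swing component.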

To use SmilePlot, add the following to the dependencies in your pom.xml file:

    <dependency>
      <groupId>com.github.haifengl</groupId>
      <artifactId>smile-plot</artifactId>
      <version>1.0.1</version>
    </dependency>

Demo Gallery

  • Kernel PCA
  • IsoMap
  • Multi-Dimensional Scaling (MDS)
  • SOM
  • Neural Network
  • SVM
  • Agglomerative Clustering
  • X-Means
  • DBScan
  • Neural Gas
  • Wavelet
  • Exponential Family Mixture