Octave-based machine-learning routines
```shell
# ensure Octave is installed
# get the code
$ git clone https://github.com/partharamanujam/octave-ml.git
# include .../octave-ml/octavelib in the Octave path
# now check the examples folder
```
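To make the library available, its directory can be added to the Octave search path from within Octave. A minimal sketch (the clone location shown is an assumption; adjust it to wherever you cloned the repository):

```octave
% add the library to the Octave search path
% (the clone location below is an assumption -- use your actual path)
addpath('~/octave-ml/octavelib');
savepath;  % optional: persist the path across Octave sessions
```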
- Normal Equation - Linear Regression
- Gradient Descent - Linear Regression
- Gradient Descent - Logistic Regression
- Support Vector Machines - Classification
- Neural Networks - Classification: ToDo
- K-Means - Clustering
- Anomaly Detection - (Multivariate) Gaussian Distribution
- Recommender Systems - Collaborative filtering (Low Rank Matrix Factorization)
- Feature Scaling/Normalization
- Principal Component Analysis (Dimensionality Reduction)
- Octave 3.6.4 or above - https://www.gnu.org/software/octave/download.html
- Octave-Forge packages - http://octave.sourceforge.net
  - specfun
  - image
- LIBSVM for Octave - http://www.csie.ntu.edu.tw/~cjlin/libsvm
A collection of commonly used machine-learning and support routines implemented in Octave, intended as a starting point for more advanced work.
See examples folder for usage.
Features are inputs from the training-set to be used for machine-learning. This is usually represented by the matrix-variable "X".
Outputs refer to the actual/known results corresponding to the input-features from the training-set. This is usually represented by the vector-variable "y".
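As a minimal illustration with made-up numbers (not data from this repository), a training set of three examples with two features each, plus the usual bias column of ones, could look like:

```octave
% three training examples, two features each (values are made up)
X = [1 2104 3;    % bias term, size, number of rooms
     1 1416 2;
     1 1534 3];
y = [400; 232; 315];  % known outputs for each training example
```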
Bias refers to error from erroneous assumptions in the learning algorithm, and Variance refers to error from sensitivity to small fluctuations in the training set. For more details, refer to the Bias-Variance Tradeoff.
Theta refers to the hypothesis of coefficients/parameters that map/fit the input-features to the output-results. This is usually represented by the vector-variable "theta" (or Theta).
Lambda is the regularization parameter used to control over-fitting of the parameters. This is usually represented by the scalar-variable "lambda".
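As a sketch of how lambda enters the picture, the standard regularized linear-regression cost can be written as below. This is the textbook formula, not necessarily a routine from octavelib, and the function name is illustrative:

```octave
% regularized linear-regression cost;
% theta(1), the bias term, is conventionally not regularized
function J = reg_cost(X, y, theta, lambda)
  m = length(y);                        % number of training examples
  err = X * theta - y;                  % prediction errors
  reg = lambda * sum(theta(2:end).^2);  % penalty, excluding the bias term
  J = (err' * err + reg) / (2 * m);
end
```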
Feature scaling/normalization is the process of modifying the input-features to allow for better fitting. This is usually done using a combination of mean (represented by parameter mu), and standard-deviation (represented by parameter sigma). Note that the bias-term is usually not scaled/normalized.
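A minimal sketch of mean/standard-deviation scaling (the function name here is illustrative, not necessarily the one used in octavelib; the bias column should be excluded before calling it):

```octave
% scale each feature column to zero mean and unit variance
function [Xnorm, mu, sigma] = normalize_features(X)
  mu = mean(X);               % per-column means
  sigma = std(X);             % per-column standard deviations
  sigma(sigma == 0) = 1;      % avoid division by zero for constant columns
  Xnorm = (X - mu) ./ sigma;  % automatic broadcasting (Octave 3.6+)
end
```

The same mu and sigma computed from the training set must be reused when scaling any new inputs at prediction time.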
Estimated-value refers to the predicted value for a given set of input-features using previously computed theta from the training-set. This is usually represented by the variable "p".
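For linear regression, for example, the estimated value is just a matrix product of the inputs with the fitted theta. A sketch with made-up numbers (the new inputs are assumed to already include the bias column and to be normalized with the training-set mu and sigma):

```octave
theta = [1; 2];      % illustrative previously-computed coefficients
Xnew  = [1 3; 1 5];  % two new examples, bias term included
p = Xnew * theta;    % estimated values: [7; 11]
```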
This section provides a list of various machine-learning and support routines available. For more detailed information, please look at the embedded documentation using 'help' for the specific routine.
- This code: MIT
- Porter-Stemmer: BSD
- fmincg: © Copyright 1999, 2000 & 2001, Carl Edward Rasmussen
- Examples data: Courtesy ML-007 class