
Releases: yoshoku/rumale

0.23.0

04 Apr 02:51
  • Changed the solver automatically selected by solver: 'auto' in LinearRegression and Ridge from sgd to lbfgs; when Numo::Linalg is loaded, svd is still selected.
    require 'rumale'
    
    reg = Rumale::LinearModel::Ridge.new(solver: 'auto')
    pp reg.params[:solver]
    # => "lbfgs"
    
    require 'numo/linalg/autoloader'
    
    reg = Rumale::LinearModel::Ridge.new(solver: 'auto')
    pp reg.params[:solver]
    # => "svd"

0.22.5

14 Mar 02:17
  • Added a new transformer that calculates the kernel matrix of given samples.

    regressor = Rumale::Pipeline::Pipeline.new(
      steps: {
        ker: Rumale::Preprocessing::KernelCalculator.new(kernel: 'rbf', gamma: 0.5),
        krr: Rumale::KernelMachine::KernelRidge.new(reg_param: 1.0)
      }
    )
    regressor.fit(x_train, y_train)
    res = regressor.predict(x_test)
  • Added a new classifier based on kernelized ridge regression.

    classifier = Rumale::KernelMachine::KernelRidgeClassifier.new(reg_param: 1.0)
  • Nystroem now supports linear, polynomial, and sigmoid kernel functions.

    nystroem = Rumale::KernelApproximation::Nystroem.new(
                 kernel: 'poly', gamma: 1, degree: 3, coef: 8, n_components: 256
               )
  • load_libsvm_file has a new parameter n_features for specifying the number of features.

    x, y = Rumale::Dataset.load_libsvm_file('mnist.t', n_features: 780)
    p x.shape
    # => [10000, 780]

0.22.4

22 Feb 23:44
  • Added classifier and regressor classes for a voting-based ensemble method that combines the predictions of estimators by majority voting.
require 'numo/openblas'
require 'parallel'
require 'rumale'

# ... Loading dataset

clf = Rumale::Ensemble::VotingClassifier.new(
  estimators: {
    log: Rumale::LinearModel::LogisticRegression.new(random_seed: 1),
    rnd: Rumale::Ensemble::RandomForestClassifier.new(n_jobs: -1, random_seed: 1),
    ext: Rumale::Ensemble::ExtraTreesClassifier.new(n_jobs: -1, random_seed: 1)
  },
  weights: {
    log: 0.5,
    rnd: 0.3,
    ext: 0.2
  },
  voting: 'soft'
)

clf.fit(x, y)

0.22.3

23 Jan 05:34
  • Added a regressor class for the non-negative least squares (NNLS) method. NNLS is a linear regression method that constrains the coefficients to be non-negative.
require 'rumale'

rng = Random.new(1)
n_samples = 200
n_features = 100

# Prepare example data set.
x = Rumale::Utils.rand_normal([n_samples, n_features], rng)

coef = Rumale::Utils.rand_normal([n_features, 1], rng)
coef[coef.lt(0)] = 0.0 # Non-negative coefficients

noise = Rumale::Utils.rand_normal([n_samples, 1], rng)

y = x.dot(coef) + noise

# Split data set with holdout method.
x_train, x_test, y_train, y_test = Rumale::ModelSelection.train_test_split(x, y, test_size: 0.4, random_seed: 1)

# Fit non-negative least squares.
nnls = Rumale::LinearModel::NNLS.new(reg_param: 1e-4, random_seed: 1).fit(x_train, y_train)
puts(format("NNLS R2-Score: %.4f", nnls.score(x_test, y_test)))

# Fit ridge regression.
ridge = Rumale::LinearModel::Ridge.new(solver: 'lbfgs', reg_param: 1e-4, random_seed: 1).fit(x_train, y_train)
puts(format("Ridge R2-Score: %.4f", ridge.score(x_test, y_test)))
$ ruby nnls.rb
NNLS R2-Score: 0.9478
Ridge R2-Score: 0.8602

0.22.2

10 Jan 04:52
  • Added classifier and regressor classes for stacked generalization, a method that combines estimators to improve prediction accuracy:
require 'numo/openblas'
require 'parallel'
require 'rumale'

# ... Loading dataset

clf = Rumale::Ensemble::StackingClassifier.new(
  estimators: {
    rnd: Rumale::Ensemble::RandomForestClassifier.new(max_features: 4, n_jobs: -1, random_seed: 1),
    ext: Rumale::Ensemble::ExtraTreesClassifier.new(max_features: 4, n_jobs: -1, random_seed: 1),
    grd: Rumale::Ensemble::GradientBoostingClassifier.new(n_jobs: -1, random_seed: 1),
    rdg: Rumale::LinearModel::LogisticRegression.new
  },
  meta_estimator: Rumale::LinearModel::LogisticRegression.new(reg_param: 1e2),
  random_seed: 1
)

clf.fit(x, y)

0.22.1

05 Dec 04:42
  • Added a transformer class for MLKR (Metric Learning for Kernel Regression), which learns a transformation/projection guided by the target variables. The following is an example of transforming toy data with PCA and MLKR.
require 'rumale'

def make_regression(n_samples: 500, n_features: 10, n_informative: 4, n_targets: 1)
  n_informative = [n_features, n_informative].min

  rng = Random.new(42)
  x = Rumale::Utils.rand_normal([n_samples, n_features], rng)

  ground_truth = Numo::DFloat.zeros(n_features, n_targets)
  ground_truth[0...n_informative, true] = 100 * Rumale::Utils.rand_uniform([n_informative, n_targets], rng)
  y = x.dot(ground_truth)
  y = y.flatten

  rand_ids = Array(0...n_samples).shuffle(random: rng)
  x = x[rand_ids, true].dup
  y = y[rand_ids].dup

  [x, y]
end

x, y = make_regression

pca = Rumale::Decomposition::PCA.new(n_components: 10)
z_pca = pca.fit_transform(x)
mlkr = Rumale::MetricLearning::MLKR.new(n_components: nil, init: 'pca')
z_mlkr = mlkr.fit_transform(x, y)

# After that, these results are visualized by multidimensional scaling.

PCA: (scatter plot of the PCA embedding)

MLKR: (scatter plot of the MLKR embedding)

0.22.0

22 Nov 08:10
  • Added the lbfgsb.rb gem to the runtime dependencies for optimization.
    This eliminates the need to require the mopti gem when using NeighbourhoodComponentAnalysis. In addition, an lbfgs solver has been added to LogisticRegression, and its default solver has changed from 'sgd' to 'lbfgs'. In many cases, the lbfgs solver is faster and more stable than the sgd solver.

0.21.0

03 Oct 04:13
  • Changed the default value of max_iter on LinearModel estimators from 200 to 1000.
    The LinearModel estimators use the stochastic gradient descent method for optimization, and convergence generally benefits from a larger number of iterations, so Rumale has increased the default value of max_iter.