update to CRAN version 1.0.7 #340

Merged
merged 11 commits into from
Mar 22, 2021
11 changes: 10 additions & 1 deletion .ci/r_tests.sh
@@ -14,7 +14,9 @@ if [[ $OS_NAME == "macos-latest" ]]; then
 echo 'options(install.packages.check.source = "no")' >> .Rprofile
 else
 tlmgr --verify-repo=none update --self
-tlmgr --verify-repo=none install ec
+tlmgr --verify-repo=none install ec hyperref iftex infwarerr kvoptions pdftexcmds
+
+echo "Sys.setenv(RETICULATE_PYTHON = '$CONDA_PREFIX/bin/python')" >> .Rprofile
 fi

R_LIB_PATH=$HOME/R
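
A note on the `.Rprofile` line added in this hunk: the outer double quotes let the shell expand `$CONDA_PREFIX` when the CI script runs, while the inner single quotes are written through literally and become the R string delimiters. A minimal sketch of that behaviour, assuming a scratch directory and a placeholder prefix (`/opt/conda/envs/demo` is hypothetical and only used when `CONDA_PREFIX` is unset):

```shell
#!/usr/bin/env bash

# Hypothetical fallback so the sketch also runs outside a conda environment.
CONDA_PREFIX="${CONDA_PREFIX:-/opt/conda/envs/demo}"

# Write into a scratch directory instead of the real project .Rprofile.
workdir="$(mktemp -d)"
rprofile="$workdir/.Rprofile"

# Same quoting as the CI script: $CONDA_PREFIX is expanded by the shell,
# and the single quotes end up in the file as R string quotes.
echo "Sys.setenv(RETICULATE_PYTHON = '$CONDA_PREFIX/bin/python')" >> "$rprofile"

# Show the line R would evaluate at startup.
cat "$rprofile"
```

Because the file is appended to with `>>`, running the CI step twice would add the line twice; that is harmless for `Sys.setenv` but worth knowing when debugging a reused runner.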
@@ -26,6 +28,13 @@ echo "R_LIBS=$R_LIB_PATH" > .Renviron
 export _R_CHECK_CRAN_INCOMING_=0
 export _R_CHECK_CRAN_INCOMING_REMOTE_=0

+# increase the allowed time to run the examples
+export _R_CHECK_EXAMPLE_TIMING_THRESHOLD_=30
+
+# fix the 'unable to verify current time' NOTE
+# see: https://stackoverflow.com/a/63837547/8302386
+export _R_CHECK_SYSTEM_CLOCK_=0
+
if [[ $OS_NAME == "macos-latest" ]]; then
Rscript -e "install.packages('devtools', dependencies = TRUE, repos = 'https://cran.r-project.org')"
fi
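
The `_R_CHECK_*` variables in this hunk only affect processes started from the same shell, so they have to be exported before `R CMD check` is invoked. A small sketch that groups them in one function and prints what a child process would inherit; the function name `set_r_check_env` is hypothetical and not part of this PR:

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in the PR): export the check-related settings together.
set_r_check_env() {
    export _R_CHECK_CRAN_INCOMING_=0
    export _R_CHECK_CRAN_INCOMING_REMOTE_=0
    # Value from this PR: "increase the allowed time to run the examples".
    export _R_CHECK_EXAMPLE_TIMING_THRESHOLD_=30
    # Value from this PR: "fix the 'unable to verify current time' NOTE".
    export _R_CHECK_SYSTEM_CLOCK_=0
}

set_r_check_env

# Exported variables are visible to child processes; list the relevant ones.
env | grep '^_R_CHECK_' | sort
```

Keeping the exports in one place makes it easier to keep the Bash and PowerShell scripts in sync, since both set the same four values.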
7 changes: 7 additions & 0 deletions .ci/r_tests_windows.ps1
@@ -24,6 +24,13 @@ Remove-Item C:\rtools40 -Force -Recurse -ErrorAction Ignore
 $env:_R_CHECK_CRAN_INCOMING_ = 0
 $env:_R_CHECK_CRAN_INCOMING_REMOTE_ = 0

+# increase the allowed time to run the examples
+$env:_R_CHECK_EXAMPLE_TIMING_THRESHOLD_ = 30
+
+# fix the 'unable to verify current time' NOTE
+# see: https://stackoverflow.com/a/63837547/8302386
+$env:_R_CHECK_SYSTEM_CLOCK_ = 0
+
$R_VER = "4.0.4"
$ProgressPreference = "SilentlyContinue" # progress bar bug extremely slows down download speed
Invoke-WebRequest -Uri https://cloud.r-project.org/bin/windows/base/old/$R_VER/R-$R_VER-win.exe -OutFile R-win.exe -MaximumRetryCount 3
4 changes: 0 additions & 4 deletions R-package/.gitignore

This file was deleted.

6 changes: 3 additions & 3 deletions R-package/DESCRIPTION
@@ -1,8 +1,8 @@
 Package: RGF
 Type: Package
 Title: Regularized Greedy Forest
-Version: 1.0.6.3
-Date: 2019-12-12
+Version: 1.0.7
+Date: 2021-03-17
Authors@R: c( person("Lampros", "Mouselimis", email = "[email protected]", role = c("aut", "cre")), person("Ryosuke", "Fukatani", role = "cph", comment = "Author of the python wrapper of the 'Regularized Greedy Forest' machine learning algorithm"), person("Nikita", "Titov", role = "cph", comment = "Author of the python wrapper of the 'Regularized Greedy Forest' machine learning algorithm"), person("Tong", "Zhang", role = "cph", comment = "Author of the 'Regularized Greedy Forest' and of the Multi-core implementation of Regularized Greedy Forest machine learning algorithm"), person("Rie", "Johnson", role = "cph", comment = "Author of the 'Regularized Greedy Forest' machine learning algorithm") )
Maintainer: Lampros Mouselimis <[email protected]>
BugReports: https://github.com/RGF-team/rgf/issues
@@ -21,5 +21,5 @@ Suggests:
 rmarkdown
 Encoding: UTF-8
 LazyData: true
-RoxygenNote: 7.0.2
+RoxygenNote: 7.1.1
VignetteBuilder: knitr
4 changes: 4 additions & 0 deletions R-package/NEWS.md
@@ -2,6 +2,10 @@
* We've modified the *package.R* file so that messages are printed to the console whenever Python or any of the required modules is not available. Moreover, for the R-package testing the conda environment parameter is adjusted ( this applies to the RGF-team Github repository and not to the CRAN package directly )
* We've modified the *.appveyor.yml* file to return the *artifacts* in order to observe if tests ran successfully ( this applies to the RGF-team Github repository and not to the CRAN package directly )
* We've added tests to increase the code coverage.
+* We've dropped support for Python 2.7
+* We've also fixed the invalid URLs in the README.md file
+* We've removed the 'zzz.R' file, which included the message: 'Beginning from version 1.0.3 the 'dgCMatrix_2scipy_sparse' function was renamed to 'TO_scipy_sparse' and now accepts either a 'dgCMatrix' or a 'dgRMatrix' as input. The appropriate format for the 'RGF' package in case of sparse matrices is the 'dgCMatrix' format (scipy.sparse.csc_matrix)', since after 4 version updates it is no longer required
+* We've modified the '.onLoad' function in the 'package.R' file by removing 'reticulate::py_available(initialize = TRUE)' which forces reticulate to initialize Python and gives the following NOTE on CRAN 'Warning in system2(command = python, args = shQuote(config_script), stdout = TRUE, : ..."' had status 2' (see: https://github.com/rstudio/reticulate/issues/730#issuecomment-594365528)
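
The availability checks mentioned in these entries (is Python present, is the required module importable) can also be approximated from a CI shell script before any R code runs. A rough sketch, assuming a `python3` executable on the `PATH`; the helper name `have_python_module` is hypothetical, and this is not how the R package itself performs the check:

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed only if python3 exists and can locate the module.
have_python_module() {
    command -v python3 >/dev/null 2>&1 || return 1
    # find_spec returns None (or raises for a missing parent package) when the
    # module cannot be found; both cases end in a non-zero exit status.
    python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1" 2>/dev/null
}

for module in rgf.sklearn scipy; do
    if have_python_module "$module"; then
        echo "$module: available"
    else
        echo "$module: not available, dependent examples would be skipped"
    fi
done
```

Unlike `reticulate::py_available(initialize = TRUE)`, this probe runs in a separate process, so it does not initialize Python inside the R session.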


## RGF 1.0.6
26 changes: 15 additions & 11 deletions R-package/R/FastRGF_Classifier.R
@@ -51,7 +51,7 @@
 #' min_child_weight = 5.0, data_l2 = 2.0,
 #' sparse_max_features = 80000,
 #' sparse_min_occurences = 5,
-#' calc_prob="sigmoid", n_jobs = 1,
+#' calc_prob = "sigmoid", n_jobs = 1,
#' verbose = 0)}}{}
#'
#' \item{\code{--------------}}{}
@@ -89,25 +89,29 @@
 #' # min_child_weight = 5.0, data_l2 = 2.0,
 #' # sparse_max_features = 80000,
 #' # sparse_min_occurences = 5,
-#' # calc_prob="sigmoid", n_jobs = 1,
+#' # calc_prob = "sigmoid", n_jobs = 1,
 #' # verbose = 0)
 #' @examples
 #'
-#' if (reticulate::py_available() && reticulate::py_module_available("rgf.sklearn")) {
+#' try({
+#' if (reticulate::py_available(initialize = TRUE)) {
+#' if (reticulate::py_module_available("rgf.sklearn")) {
 #'
-#' library(RGF)
+#' library(RGF)
 #'
-#' set.seed(1)
-#' x = matrix(runif(100000), nrow = 100, ncol = 1000)
+#' set.seed(1)
+#' x = matrix(runif(100000), nrow = 100, ncol = 1000)
 #'
-#' y = sample(1:2, 100, replace = TRUE)
+#' y = sample(1:2, 100, replace = TRUE)
 #'
-#' fast_RGF_class = FastRGF_Classifier$new(max_leaf = 50)
+#' fast_RGF_class = FastRGF_Classifier$new(max_leaf = 50)
 #'
-#' fast_RGF_class$fit(x, y)
+#' fast_RGF_class$fit(x, y)
 #'
-#' preds = fast_RGF_class$predict_proba(x)
-#' }
+#' preds = fast_RGF_class$predict_proba(x)
+#' }
+#' }
+#' }, silent = TRUE)

FastRGF_Classifier <- R6::R6Class(
"FastRGF_Classifier",
22 changes: 13 additions & 9 deletions R-package/R/FastRGF_Regressor.R
@@ -83,21 +83,25 @@
 #' # n_jobs = 1, verbose = 0)
 #' @examples
 #'
-#' if (reticulate::py_available() && reticulate::py_module_available("rgf.sklearn")) {
+#' try({
+#' if (reticulate::py_available(initialize = TRUE)) {
+#' if (reticulate::py_module_available("rgf.sklearn")) {
 #'
-#' library(RGF)
+#' library(RGF)
 #'
-#' set.seed(1)
-#' x = matrix(runif(100000), nrow = 100, ncol = 1000)
+#' set.seed(1)
+#' x = matrix(runif(100000), nrow = 100, ncol = 1000)
 #'
-#' y = runif(100)
+#' y = runif(100)
 #'
-#' fast_RGF_regr = FastRGF_Regressor$new(max_leaf = 50)
+#' fast_RGF_regr = FastRGF_Regressor$new(max_leaf = 50)
 #'
-#' fast_RGF_regr$fit(x, y)
+#' fast_RGF_regr$fit(x, y)
 #'
-#' preds = fast_RGF_regr$predict(x)
-#' }
+#' preds = fast_RGF_regr$predict(x)
+#' }
+#' }
+#' }, silent = TRUE)

FastRGF_Regressor <- R6::R6Class(
"FastRGF_Regressor",
28 changes: 16 additions & 12 deletions R-package/R/RGF_Classifier.R
@@ -22,7 +22,7 @@
#' @param memory_policy a character string. One of \emph{"conservative"} (it uses less memory at the expense of longer runtime. Try only when with default value it uses too much memory) or \emph{"generous"} (it runs faster using more memory by keeping the sorted orders of the features on memory for reuse). Memory using policy.
#' @param verbose an integer. Controls the verbosity of the tree building process.
#' @param init_model either NULL or a character string, optional (default=NULL). Filename of a previously saved model from which training should do warm-start. If model has been saved into multiple files, do not include numerical suffixes in the filename. \emph{NOTE:} Make sure you haven't forgotten to increase the value of the max_leaf parameter regarding to the specified warm-start model because warm-start model trees are counted in the overall number of trees.
-#' @param filename a character string specifying a valid path to a file where the fitted model should be saved
+#' @param filename a character string specifying a valid path to a file where the fitted model should be saved
#' @export
#' @details
#'
@@ -41,7 +41,7 @@
#' the \emph{feature_importances} function returns the feature importances for the data.
#'
#' the \emph{dump_model} function currently prints information about the fitted model in the console
-#'
+#'
#' the \emph{save_model} function saves a model to a file from which training can do warm-start in the future.
#'
#' @references \emph{https://github.com/RGF-team/rgf/tree/master/python-package}, \emph{Rie Johnson and Tong Zhang, Learning Nonlinear Functions Using Regularized Greedy Forest}
@@ -93,7 +93,7 @@
#' \item{\code{dump_model()}}{}
#'
#' \item{\code{--------------}}{}
-#'
+#'
#' \item{\code{save_model(filename)}}{}
#'
#' \item{\code{--------------}}{}
@@ -109,21 +109,25 @@
 #' # verbose = 0, init_model = NULL)
 #' @examples
 #'
-#' if (reticulate::py_available() && reticulate::py_module_available("rgf.sklearn")) {
+#' try({
+#' if (reticulate::py_available(initialize = TRUE)) {
+#' if (reticulate::py_module_available("rgf.sklearn")) {
 #'
-#' library(RGF)
+#' library(RGF)
 #'
-#' set.seed(1)
-#' x = matrix(runif(1000), nrow = 100, ncol = 10)
+#' set.seed(1)
+#' x = matrix(runif(1000), nrow = 100, ncol = 10)
 #'
-#' y = sample(1:2, 100, replace = TRUE)
+#' y = sample(1:2, 100, replace = TRUE)
 #'
-#' RGF_class = RGF_Classifier$new(max_leaf = 50)
+#' RGF_class = RGF_Classifier$new(max_leaf = 50)
 #'
-#' RGF_class$fit(x, y)
+#' RGF_class$fit(x, y)
 #'
-#' preds = RGF_class$predict_proba(x)
-#' }
+#' preds = RGF_class$predict_proba(x)
+#' }
+#' }
+#' }, silent = TRUE)

RGF_Classifier <- R6::R6Class(
"RGF_Classifier",
27 changes: 16 additions & 11 deletions R-package/R/RGF_Regressor.R
@@ -20,7 +20,7 @@
#' @param memory_policy a character string. One of \emph{"conservative"} (it uses less memory at the expense of longer runtime. Try only when with default value it uses too much memory) or \emph{"generous"} (it runs faster using more memory by keeping the sorted orders of the features on memory for reuse). Memory using policy.
#' @param verbose an integer. Controls the verbosity of the tree building process.
#' @param init_model either NULL or a character string, optional (default=NULL). Filename of a previously saved model from which training should do warm-start. If model has been saved into multiple files, do not include numerical suffixes in the filename. \emph{NOTE:} Make sure you haven't forgotten to increase the value of the max_leaf parameter regarding to the specified warm-start model because warm-start model trees are counted in the overall number of trees.
-#' @param filename a character string specifying a valid path to a file where the fitted model should be saved
+#' @param filename a character string specifying a valid path to a file where the fitted model should be saved
#' @export
#' @details
#'
@@ -37,7 +37,7 @@
#' the \emph{feature_importances} function returns the feature importances for the data.
#'
#' the \emph{dump_model} function currently prints information about the fitted model in the console
-#'
+#'
#' the \emph{save_model} function saves a model to a file from which training can do warm-start in the future.
#'
#' @references \emph{https://github.com/RGF-team/rgf/tree/master/python-package}, \emph{Rie Johnson and Tong Zhang, Learning Nonlinear Functions Using Regularized Greedy Forest}
@@ -99,21 +99,26 @@
 #' # verbose = 0, init_model = NULL)
 #' @examples
 #'
-#' if (reticulate::py_available() && reticulate::py_module_available("rgf.sklearn")) {
+#' try({
+#' if (reticulate::py_available(initialize = TRUE)) {
+#' if (reticulate::py_module_available("rgf.sklearn")) {
 #'
-#' library(RGF)
+#' library(RGF)
 #'
-#' set.seed(1)
-#' x = matrix(runif(1000), nrow = 100, ncol = 10)
+#' set.seed(1)
+#' x = matrix(runif(1000), nrow = 100, ncol = 10)
 #'
-#' y = runif(100)
+#' y = runif(100)
 #'
-#' RGF_regr = RGF_Regressor$new(max_leaf = 50)
+#' RGF_regr = RGF_Regressor$new(max_leaf = 50)
 #'
-#' RGF_regr$fit(x, y)
+#' RGF_regr$fit(x, y)
 #'
-#' preds = RGF_regr$predict(x)
-#' }
+#' preds = RGF_regr$predict(x)
+#' }
+#' }
+#' }, silent = TRUE)
+
RGF_Regressor <- R6::R6Class(
"RGF_Regressor",
inherit = Internal_class,
53 changes: 29 additions & 24 deletions R-package/R/TO_scipy_sparse.R
@@ -13,45 +13,50 @@
 #' @references https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/dgCMatrix-class.html, https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/dgRMatrix-class.html, https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csc_matrix.html#scipy.sparse.csc_matrix
 #' @examples
 #'
-#' if (reticulate::py_available() && reticulate::py_module_available("scipy")) {
+#' try({
+#' if (reticulate::py_available(initialize = TRUE)) {
+#' if (reticulate::py_module_available("scipy")) {
 #'
-#' if (Sys.info()["sysname"] != 'Darwin') {
+#' if (Sys.info()["sysname"] != 'Darwin') {
 #'
-#' library(RGF)
+#' library(RGF)
 #'
 #'
-#' # 'dgCMatrix' sparse matrix
-#' #--------------------------
+#' # 'dgCMatrix' sparse matrix
+#' #--------------------------
 #'
-#' data = c(1, 0, 2, 0, 0, 3, 4, 5, 6)
+#' data = c(1, 0, 2, 0, 0, 3, 4, 5, 6)
 #'
-#' dgcM = Matrix::Matrix(
-#' data = data
-#' , nrow = 3
-#' , ncol = 3
-#' , byrow = TRUE
-#' , sparse = TRUE
-#' )
+#' dgcM = Matrix::Matrix(
+#' data = data
+#' , nrow = 3
+#' , ncol = 3
+#' , byrow = TRUE
+#' , sparse = TRUE
+#' )
 #'
-#' print(dim(dgcM))
+#' print(dim(dgcM))
 #'
-#' res = TO_scipy_sparse(dgcM)
+#' res = TO_scipy_sparse(dgcM)
 #'
-#' print(res$shape)
+#' print(res$shape)
 #'
 #'
-#' # 'dgRMatrix' sparse matrix
-#' #--------------------------
+#' # 'dgRMatrix' sparse matrix
+#' #--------------------------
 #'
-#' dgrM = as(dgcM, "RsparseMatrix")
+#' dgrM = as(dgcM, "RsparseMatrix")
 #'
-#' print(dim(dgrM))
+#' print(dim(dgrM))
 #'
-#' res_dgr = TO_scipy_sparse(dgrM)
+#' res_dgr = TO_scipy_sparse(dgrM)
 #'
-#' print(res_dgr$shape)
-#' }
-#' }
+#' print(res_dgr$shape)
+#' }
+#' }
+#' }
+#' }, silent = TRUE)
+
TO_scipy_sparse = function(R_sparse_matrix) {

if (inherits(R_sparse_matrix, "dgCMatrix")) {
20 changes: 12 additions & 8 deletions R-package/R/mat_2scipy_sparse.R
@@ -9,20 +9,24 @@
 #' @references https://docs.scipy.org/doc/scipy/reference/sparse.html
 #' @examples
 #'
-#' if (reticulate::py_available() && reticulate::py_module_available("scipy")) {
+#' try({
+#' if (reticulate::py_available(initialize = TRUE)) {
+#' if (reticulate::py_module_available("scipy")) {
 #'
-#' library(RGF)
+#' library(RGF)
 #'
-#' set.seed(1)
+#' set.seed(1)
 #'
-#' x = matrix(runif(1000), nrow = 100, ncol = 10)
+#' x = matrix(runif(1000), nrow = 100, ncol = 10)
 #'
-#' res = mat_2scipy_sparse(x)
+#' res = mat_2scipy_sparse(x)
 #'
-#' print(dim(x))
+#' print(dim(x))
 #'
-#' print(res$shape)
-#' }
+#' print(res$shape)
+#' }
+#' }
+#' }, silent = TRUE)

mat_2scipy_sparse = function(x, format = 'sparse_row_matrix') {
