diff --git a/.Rbuildignore b/.Rbuildignore
index 2534116..c93ac8c 100644
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -6,6 +6,7 @@ code_ex.R
 ^\.Rprofile$
 ^cran-comments\.md$
 ^my-comments\.md$
+^NEWS\.html$
 logo.R
 ^Rscript*
 logo_large.png
\ No newline at end of file
diff --git a/DESCRIPTION b/DESCRIPTION
index 789ab57..196131a 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -2,7 +2,7 @@ Package: resemble
 Type: Package
 Title: Memory-Based Learning in Spectral Chemometrics
 Version: 2.2.1
-Date: 2022-03-17
+Date: 2022-08-31
 Author: Leonardo Ramirez-Lopez [aut, cre],
     Antoine Stevens [aut, ctb],
     Claudio Orellano [ctb],
@@ -48,5 +48,5 @@ VignetteBuilder: knitr
 NeedsCompilation: yes
 LazyData: true
 Repository: CRAN
-RoxygenNote: 7.1.2
+RoxygenNote: 7.2.1
 Encoding: UTF-8
diff --git a/NEWS b/NEWS
deleted file mode 100644
index 5d668f9..0000000
--- a/NEWS
+++ /dev/null
@@ -1,244 +0,0 @@
-# News for the resemble package
-
-## resemble 2.1 (piapia)
-23.09.2021
-
-### New features
-- A argument named "seed" was added to the mbl function. It is used to gurantee
-reproducibility of cross-validation results.
-
-- A modified PLS method was implemented (see local_fit_pls and local_fit_wapls).
-It uses correlation bteween response and predictors to derive the PLS weights.
-
-## Improvements and fixes
-- A Bug in the computation of the explained variance of X for pls models was
-detected and fixed. The pls related functions were underestimating the amount
-of variance exmplained by each PLS component. The explained variance was being
-computed from the matrix of scores, in this version it is computed from the
-reconstructed spectra at each PLS iteration.
-
-- Manual selection of components in pc_projection() and pls_projection() now
-accepts to select only 1 component (before the minimum was 2).
-
-- pls_projection() now includes a new argument (method) to allow the user to select
-between the standard pls algorithm and a modified pls algorithm.
-
-- ortho_diss(), dissimilarity() and search_neighbors() functions include a new
-dissimilarity method: "mpls" (modified pls).
-
-- An internal function for stratified random sampling for cross validtaion
-purposes has been improved for computational speed.
-
-- The package was stripping some symbols for Rcpp functions in Makevars in order
-to reduce the installation size of the package. Now these lines have been
-commented to comply with CRAN policies.
-
-
-## resemble 2.0 (gordillo)
-
-* 02.07.2020
-During the recent lockdown we had the chance to inevest a enough time on the
-development of a new version of the package resemble. This new version comes
-with significant improvements as well as new functionality. For example, it now
-matches the tidyverse R style guide, it includes unit tests, includes new
-functionality for modeling, mbl is faster and a less memory-hungry.
-
-### New features
-- search_neighbors() function.
-- dissimilarity() function.
-
-## Improvements and fixes
-- mbl is faster and a less memory-hungry.
-- New vignette.
-- unit tests have been introduced using the testthat package.
-
-## Breaking changes
-
-### orthoProjection, pcProjection, plsProjection (renamed to ortho_projection,
-pc_projection, pls_projection respectively):
-- X2 argument renamed to Xu (for consistency throughout all the fucntions)
-- Argument scaled renamed to .scale
-- Argument max.iter renamed to max_iter
-- Bug fix: when the "pca.nipals method was used and the method to select the pcs wa "opc",
-  the function was returning 1 component less than the maximum requested.
-- "pca.nipals" is now implemented in C++ via Rcpp
-- Bug fix in plsProjection: when "cumvar" was used as the pcSelection method, an
-  error about data allocation in a matrix was retrieved
-- Argument pcSelection renamed to pc_selection
-- ... is deprecated in both pcProjection and plsProjection
-- Argument cores is deprecated, it was used to set the number of cores in some
-c++ internal functions via OpenMP in Rcpp.
-- Names the following outputs have been changed: X.loadings, Y.loadings, sc.sdv
-and n.components. Their new names are: X_loadings, Y_loadings, sc_sdv and
-n_components.
-
-### corDiss (renamed to cor_diss):
-- X2 argument renamed to Xu (for consistency throughout all the fucntions)
-- Argument scaled renamed to .scale
-- default for .scale has changed from TRUE to FALSE
-- the dimnames of the resulting matrix are now Xr_1... Xr_n (previusly Xr.1... Xr.n)
-
-### fDiss (renamed to f_diss):
-- X2 argument renamed to Xu (for consistency throughout all the fucntions)
-- Argument scaled renamed to .scale
-- default for .scale has changed from TRUE to FALSE
-- the dimnames of the resulting matrix are now Xr_1... Xr_n (previusly Xr.1... Xr.n)
-- argument method changed to diss_method
-
-### sid:
-- X2 argument renamed to Xu (for consistency throughout all the fucntions)
-- Argument scaled renamed to .scale
-- default for .scale has changed from TRUE to FALSE
-- the dimnames of the resulting matrix are now Xr_1... Xr_n (previusly Xr.1... Xr.n)
-
-
-### orthoDiss (renamed to ortho_diss):
-- X2 argument renamed to Xu (for consistency throughout all the fucntions)
-- Argument scaled renamed to .scale
-- Argument local renamed to .local
-- Argument pcSelection renamed to pc_selection
-- Argument return.all renamed to compute_all
-- Argument cores is deprecated, it wwas used to set the number of cores in some
-c++ internal functions via OpenMP in Rcpp.
-- When \code{.local = TRUE} a new output is produced: 'neighborhood_info' which
-is a data.frame containing the relevant information about the neighborhood of
-each sample (e.g. neighborhood indices, number of components used at each
-neighborhood, etc)
-- Output global.variance.info has been renamed to global_variance_info
-
-
-### simEval (renamed to sim_eval):
-- argument sideInf renamed to side_info
-- argument lower.tri renamed to lower_triangle
-- argument cores renamed to omp_threads
-- lower_triangle is deprecated. Now if a vector is passed to d, the function assumes
-  that it is a lower triangle of a distance matrix
-- Argument cores is deprecated, it wwas used to set the number of cores in some
-c++ internal functions via OpenMP in Rcpp.
-
-
-### mbl
-- pls.max.iter, pls.tol and noise.v were moved to mbl from mbl_control()
-- Argument scaled (from mbl_control()) renamed to .scale and moved to mbl
-- new arguments: gh and spike
-- Argument pcSelection renamed to pc_selection
-- Argument mblCtrl renamed to control
-- Argument dissUsage renamed to diss_usage
-- order of the Yr, Xr, Yu and Xu arguments has changed to Xr, Yr, Xu and Yu
-- input type for the argument method has changed. Previously it received a
-character string indicating the type of local regresion (i.e. "pls",
-"wapls1" or "gpr"). Now it receives an object of class local_fit which is output
-by the new local_fit fucntions.
-- dissimilarityM has been deprecated. It was used to pass a dissimilarity matrix
-computed outiside the mbl fucntion. This can be done now with the new argument
-diss_method of mbl which was previosly named "sm" and it was in mblControl()
-
-
-### neigCleaning (now search_neighbors)
-- Function renamed to search_neighbors
-- Argument cores is deprecated, it was used to set the number of cores in some
-c++ internal functions via OpenMP in Rcpp.
-
-
-### mblControl (renamed to mbl_control):
-- sm argument is deprecated. Now the dissmilarity metric is an argument of the
-mbl fucntion
-- scale and center arguments have been moved to the mbl fucntion
-- Argument range.pred.lim renamed to range_prediction_limits
-- Argument cores is deprecated, it was used to set the number of cores in some
-c++ internal functions via OpenMP in Rcpp.
-- k0, pcMethod, ghMethod are deprecated
-- localOptimization has been renamed to tune_locally
-- valMethod has been renamed to validation_type
-- Option "loc_crossval" in validation_type has been renamed to "local_cv"
-
-### plot.mbl
-- option "pca" was replaced by option "gh" which plots the pls projection used
-for computing the gh distance in mbl
-
-
-
-## resemble 1.3 (never released)
-
-* 11.04.2020
-The option 'movcor' for the argument sm of mblControl() is deprecated. The
-'movcor' moving window correlations between spectra as dissimilarity measure.
-Now This option can be specified by using 'cor' as the method in the argument
-'sm' and passing a window size to the argument 'ws'of mblControl(). If 'ws'
-is not specified, the standard correlation between spectra is computed.
-
-* 27.02.2020
-The argument 'resampling' in mblControl() has been renamed to 'number'
-
-* 18.07.2019
-A bug in the scaling of the euclidean distances in fDiss was detected and fixed.
-The distance ratios (between samples) were correctly calculated, but the final
-scaling of the results was not properly done. The distance between Xi and Xj
-were scaled by taking the squared root of the mean of the squared differences
-and dividing it by the number of variables i.e. sqrt(mean((Xi-Xj)^2))/ncol(Xi),
-however the correct calculation is done by taking the mean of the squared
-differences, dividing it by the number of variables and then compute the squared
-root i.e. sqrt(mean((Xi-Xj)^2)/ncol(Xi)). This bug had no effect on the
-computations of the nearest neighbors.
-
-## resemble 1.2.0001 (alma de coco)
-* 13.09.2016
-A bug in the computation of the Mahalanobis distance in the PLS space was fixed.
-
-* 06.09.2016
-Thanks to Matthieu Lesnoff who found a bug in the predict.orthoProjection
-function (an error was thrown when PCA preditions were requested). This bug has
-been fixed.
-
-* 10.08.2016
-A bug in plot.mbl was fixed. It was not possible to plot mbl results when the
-k.diss argument (threshold distances) was used in the mbl function.
-
-* 10.08.2016
-Since the previous release, the "wapls1" regression (in the mbl function)
-is actually compatible with valMethod = "loc_crossval" (in the mblControl).
-In the previous documentation was wrongly stated otherwise. Now this has been
-corrected in the documentation.
-
-* 09.08.2016
-the projection Matrix (projectionM) returned by plsProjection now only contains
-the columns corresponding only to the number components retrieved
-
-## resemble 1.2 (alma de coco)
-* 04.03.2016
-A patch was released for and extrange bug that prevented to run mbl
-in parallel when the gpr method was used.
-
-* 19.01.2016
-Now it is possible to locally optimize the maximum and minimum pls factors in
-wapls1 local regressions.
-
-* 09.12.2015
-Many thanks to Eva Ampe and Lorenzo Menichetti for their suggestions.
-
-* 08.12.2015
-A method for better estimates of RMSE values computed for the 'wapls1' method
-has been implemented.
-
-* 08.12.2015
-The 'wapls2' method of the mbl function is no longer supported because of several
-drawbacks computing reliable uncertainty estimates.
-
-* 18.11.2015
-Several functions are now based on C++ for faster computations.
-
-* 23.04.2014
-Added default variable names when they are missing and an error message when the
-names of Xr do not match the names of Xu.
-
-* 23.04.2014
-plot.mbl draws now the circles around the actual centre function when the
-spectra is not centred for mbl.
-
-* 20.03.2013
-The function movcorDist was removed since. it was included by mistake in the
-previous version of the package. The corDiss function can be used in
-raplacement of movcorDist.
-
-## resemble 1.1.1
-* 19.03.2013 Hello world! Initial release of the package
diff --git a/NEWS.md b/NEWS.md
new file mode 100644
index 0000000..56a3ec1
--- /dev/null
+++ b/NEWS.md
@@ -0,0 +1,353 @@
+# `resemble`
+
+
+`resemble 2.2.1 (Fix-Hodges)`
+===============
+
+### Improvements and fixes
+
+* Fixed: An error was thrown when passing a pre-computed distance matrix to the
+`diss_method` argument in `mbl()` ([#24](https://github.com/l-ramirez-lopez/resemble/issues/24)).
+
+* Documentation is now compatible with HTML5.
+
+
+`resemble 2.1 (piapia)`
+===============
+
+23.09.2021
+
+### New features
+
+* The argument `seed` was added to the mbl function. It is used to gurantee
+reproducibility of cross-validation results.
+
+* A modified PLS method was implemented (see `local_fit_pls()` and `local_fit_wapls()`).
+It uses correlation bteween response and predictors to derive the PLS weights.
+
+### Improvements and fixes
+
+* A Bug in the computation of the explained variance of X for pls models was
+detected and fixed. The pls related functions were underestimating the amount
+of variance explained by each PLS component. The explained variance was being
+computed from the matrix of scores, in this version it is computed from the
+reconstructed spectra at each PLS iteration.
+
+* Manual selection of components in `pc_projection()` and `pls_projection()` now
+accepts to select only 1 component (before the minimum was 2).
+
+* `pls_projection()` now includes a new argument (`method`) to allow the user to select
+between the standard pls algorithm and a modified pls algorithm.
+
+* `ortho_diss()`, `dissimilarity()` and `search_neighbors()` functions include a new
+dissimilarity method: `"mpls"` (modified pls).
+
+* An internal function for stratified random sampling for cross-validtaion
+purposes has been improved for computational speed.
+
+* The package was stripping some symbols for Rcpp functions in Makevars in order
+to reduce the installation size of the package. Now these lines have been
+commented to comply with CRAN policies.
+
+
+`resemble 2.0 (gordillo)`
+===============
+
+02.07.2020
+
+During the recent lockdown we had the chance to inevest a enough time on the
+development of a new version of the package `resemble`. This new version comes
+with significant improvements as well as new functionality. For example, it now
+matches the tidyverse R style guide, it includes unit tests, includes new
+functionality for modeling, mbl is faster and a less memory-hungry.
+
+### New features
+
+* `search_neighbors()` function.
+
+* `dissimilarity()` function.
+
+### Improvements and fixes
+
+* `mbl` is faster and a less memory-hungry.
+
+* New vignette.
+
+* unit tests have been introduced using the testthat package.
+
+
+
+### Breaking changes
+
+#### `orthoProjection()`, `pcProjection()`, `plsProjection()` (renamed to `ortho_projection()`,
+`pc_projection()`, `pls_projection()` respectively):
+
+* `X2` argument renamed to `Xu` (for consistency throughout all the functions).
+
+* Argument `scaled` renamed to `.scale`.
+
+* Argument `max.iter` renamed to `max_iter`.
+
+* Bug fix: when the `"pca.nipals"`" method was used and the method to select the pcs was `"opc"`,
+  the function was returning 1 component less than the maximum requested.
+
+* `"pca.nipals"` is now implemented in C++ via Rcpp.
+
+* Bug fix in `plsProjection()`: when `"cumvar"` was used as the `pcSelection` method, an
+  error about data allocation in a matrix was retrieved.
+
+* Argument `pcSelection` renamed to `pc_selection`.
+
+* `...` is deprecated in both `pcProjection()` and `plsProjection()`.
+
+* Argument `cores` is deprecated, it was used to set the number of cores in some
+c++ internal functions via OpenMP in Rcpp.
+
+* Names the following outputs have been changed: `X.loadings`, `Y.loadings`, `sc.sdv`
+and `n.components`. Their new names are: `X_loadings`, `Y_loadings`, `sc_sdv` and
+`n_components`.
+
+
+#### `corDiss()` (renamed to `cor_diss()`):
+
+* `X2` argument renamed to `Xu` (for consistency throughout all the functions).
+
+* Argument `scaled` renamed to `.scale`.
+
+* default for `.scale` has changed from `TRUE` to `FALSE`.
+
+* the dimnames of the resulting matrix are now Xr_1... Xr_n (previusly Xr.1... Xr.n).
+
+
+#### `fDiss()` (renamed to `f_diss()`):
+
+* `X2` argument renamed to `Xu` (for consistency throughout all the functions).
+
+* Argument `scaled` renamed to `.scale`.
+
+* default for `.scale` has changed from `TRUE` to `FALSE`.
+
+* the dimnames of the resulting matrix are now Xr_1... Xr_n (previusly Xr.1... Xr.n).
+
+* argument method changed to diss_method.
+
+
+#### `sid()`:
+
+* `X2` argument renamed to `Xu` (for consistency throughout all the functions).
+
+* Argument `scaled` renamed to `.scale`.
+
+* default for `.scale` has changed from `TRUE` to `FALSE`.
+
+* the dimnames of the resulting matrix are now Xr_1... Xr_n (previusly Xr.1... Xr.n).
+
+
+
+#### orthoDiss (renamed to ortho_diss):
+
+* `X2` argument renamed to `Xu` (for consistency throughout all the functions).
+
+* Argument `scaled` renamed to `.scale`.
+
+* Argument `local` renamed to `.local`.
+
+* Argument `pcSelection` renamed to `pc_selection`.
+
+* Argument `return.all` renamed to `compute_all`.
+
+* Argument `cores` is deprecated, it wwas used to set the number of cores in some
+c++ internal functions via OpenMP in Rcpp.
+
+* When `.local = TRUE` a new output is produced: `neighborhood_info` which
+is a data.frame containing the relevant information about the neighborhood of
+each sample (e.g. neighborhood indices, number of components used at each
+neighborhood, etc).
+
+* Output `global.variance.info` has been renamed to `global_variance_info`
+
+
+#### `simEval()` (renamed to `sim_eval()`):
+
+* argument `sideInf` renamed to `side_info`.
+
+* argument `lower.tri` renamed to `lower_triangle`.
+
+* argument `cores` renamed to `omp_threads`.
+
+* `lower_triangle` is deprecated. Now if a vector is passed to d, the function assumes
+  that it is a lower triangle of a distance matrix.
+
+* Argument `cores` is deprecated, it was used to set the number of cores in some
+c++ internal functions via OpenMP in Rcpp.
+
+
+#### `mbl()`
+
+* `pls.max.iter`, `pls.tol` and `noise.v` were moved to `mbl()` from `mbl_control()`.
+
+* Argument scaled (from `mbl_control()`) renamed to .scale and moved to `mbl()`.
+
+* new arguments: `gh` and `spike`.
+
+* Argument `pcSelection` renamed to `pc_selection`.
+
+* Argument `mblCtrl` renamed to `control`.
+
+* Argument `dissUsage` renamed to `diss_usage`.
+
+* order of the `Yr`, `Xr`, `Yu` and `Xu` arguments has changed to `Xr`, `Yr`, `Xu` and `Yu`.
+
+* input type for the argument method has changed. Previously it received a
+character string indicating the type of local regresion (i.e. "pls",
+"wapls1" or "gpr"). Now it receives an object of class `local_fit` which is output
+by the new `local_fit` functions.
+
+* `dissimilarityM` has been deprecated. It was used to pass a dissimilarity matrix
+computed outside the `mbl()` function. This can be done now with the new argument
+`diss_method` of `mbl` which was previously named `"sm"` and it was in `mblControl()`.
+
+
+#### `neigCleaning()` (now `search_neighbors()`)
+
+* Function renamed to `search_neighbors`.
+
+* Argument `cores` is deprecated, it was used to set the number of cores in some
+c++ internal functions via OpenMP in Rcpp.
+
+
+#### `mblControl()` (renamed to `mbl_control()`):
+
+* `sm` argument is deprecated. Now the dissmilarity metric is an argument of the
+mbl function.
+
+* `scale` and `center` arguments have been moved to the `mbl()` function.
+
+* Argument `range.pred.lim` renamed to `range_prediction_limits`.
+
+* Argument `cores` is deprecated, it was used to set the number of cores in some
+c++ internal functions via OpenMP in Rcpp.
+
+* `k0`, `pcMethod`, `ghMethod` are deprecated.
+
+* `localOptimization` has been renamed to `tune_locally`.
+
+* `valMethod` has been renamed to `validation_type`.
+
+* Option `"loc_crossval"` in validation_type has been renamed to `"local_cv"`.
+
+#### `plot.mbl()`
+
+* option `"pca"` was replaced by option `"gh"` which plots the pls projection used
+for computing the gh distance in `mbl()`.
+
+
+
+`resemble 1.3` (never released)
+===============
+
+11.04.2020
+
+The option 'movcor' for the argument sm of mblControl() is deprecated. The
+'movcor' moving window correlations between spectra as dissimilarity measure.
+Now This option can be specified by using 'cor' as the method in the argument
+'sm' and passing a window size to the argument 'ws'of mblControl(). If 'ws'
+is not specified, the standard correlation between spectra is computed.
+
+27.02.2020
+
+The argument 'resampling' in mblControl() has been renamed to 'number'
+
+18.07.2019
+
+A bug in the scaling of the euclidean distances in fDiss was detected and fixed.
+The distance ratios (between samples) were correctly calculated, but the final
+scaling of the results was not properly done. The distance between Xi and Xj
+were scaled by taking the squared root of the mean of the squared differences
+and dividing it by the number of variables i.e. sqrt(mean((Xi-Xj)^2))/ncol(Xi),
+however the correct calculation is done by taking the mean of the squared
+differences, dividing it by the number of variables and then compute the squared
+root i.e. sqrt(mean((Xi-Xj)^2)/ncol(Xi)). This bug had no effect on the
+computations of the nearest neighbors.
+
+`resemble 1.2.0001 (alma de coco)`
+===============
+
+13.09.2016
+
+A bug in the computation of the Mahalanobis distance in the PLS space was fixed.
+
+06.09.2016
+
+Thanks to Matthieu Lesnoff who found a bug in the predict.orthoProjection
+function (an error was thrown when PCA preditions were requested). This bug has
+been fixed.
+
+10.08.2016
+
+A bug in plot.mbl was fixed. It was not possible to plot mbl results when the
+k.diss argument (threshold distances) was used in the mbl function.
+
+10.08.2016
+
+Since the previous release, the "wapls1" regression (in the mbl function)
+is actually compatible with valMethod = "loc_crossval" (in the mblControl).
+In the previous documentation was wrongly stated otherwise. Now this has been
+corrected in the documentation.
+
+09.08.2016
+
+the projection Matrix (projectionM) returned by plsProjection now only contains
+the columns corresponding only to the number components retrieved
+
+`resemble 1.2 (alma de coco)`
+===============
+
+04.03.2016
+
+A patch was released for and extrange bug that prevented to run mbl
+in parallel when the gpr method was used.
+
+19.01.2016
+
+Now it is possible to locally optimize the maximum and minimum pls factors in
+wapls1 local regressions.
+
+09.12.2015
+
+Many thanks to Eva Ampe and Lorenzo Menichetti for their suggestions.
+
+08.12.2015
+
+A method for better estimates of RMSE values computed for the 'wapls1' method
+has been implemented.
+
+08.12.2015
+
+The 'wapls2' method of the mbl function is no longer supported because of several
+drawbacks computing reliable uncertainty estimates.
+
+18.11.2015
+
+Several functions are now based on C++ for faster computations.
+
+23.04.2014
+
+Added default variable names when they are missing and an error message when the
+names of Xr do not match the names of Xu.
+
+23.04.2014
+
+plot.mbl draws now the circles around the actual centre function when the
+spectra is not centred for mbl.
+
+20.03.2013
+
+The function movcorDist was removed since. it was included by mistake in the
+previous version of the package. The corDiss function can be used in
+raplacement of movcorDist.
+
+`resemble 1.1.1`
+===============
+
+19.03.2013 Hello world! Initial release of the package
diff --git a/R/mbl.R b/R/mbl.R
index 5730bf3..313a6f8 100644
--- a/R/mbl.R
+++ b/R/mbl.R
@@ -928,6 +928,7 @@ mbl <- function(Xr, Yr, Xu, Yu = NULL,
       allow_parallel = control$allow_parallel,
       ...
     )
+
     diss_xr_xu <- neighborhoods$dissimilarity
     if (!is.null(neighborhoods$projection)) {
       diss_xr_xu_projection <- neighborhoods$projection
@@ -946,9 +947,7 @@ mbl <- function(Xr, Yr, Xu, Yu = NULL,
         ))
       }
       diss_xr_xr <- diss_method[1:nrow(Xr), 1:nrow(Xr)]
-      diss_xr_xu <- diss_method[1:nrow(Xr), (1 + nrow(Xr)):ncol(diss_method)]
-      rm(diss_method)
-      gc()
+      diss_method <- diss_method[1:nrow(Xr), (1 + nrow(Xr)):ncol(diss_method)]
     }
     if (diss_usage %in% c("weights", "none")) {
       if (dim_diss[1] != n_xr & dim_diss[2] != n_xu) {
@@ -962,17 +961,19 @@ mbl <- function(Xr, Yr, Xu, Yu = NULL,
       }
     }
     diss_xr_xu <- diss_method
-    append(
-      neighborhoods,
-      diss_to_neighbors(diss_xr_xu,
-        k = k, k_diss = k_diss, k_range = k_range,
-        spike = NULL,
+    diss_method <- "external_matrix"
+
+
+    neighborhoods <-
+      diss_to_neighbors(
+        diss_xr_xu,
+        k = k_max, k_diss = k_diss_max,
+        k_range = k_range,
+        spike = spike,
         return_dissimilarity = control$return_dissimilarity
       )
-    )
-
+
     if (gh) {
-      neighborhoods <- NULL
       neighborhoods$gh$projection <- pls_projection(
         Xr = Xr, Xu = Xu,
         Yr = Yr,
@@ -980,7 +981,7 @@ mbl <- function(Xr, Yr, Xu, Yu = NULL,
         scale = scale, ...
       )
       neighborhoods$gh$gh_Xr <- f_diss(neighborhoods$gh$projection$scores,
-        Xu = t(colMeans(neighborhoods$gh$projection$scores)),
+        Xu = t(colMeans(neighborhoods$gh$projection$scores[1:nrow(Xr), ])),
         diss_method = "mahalanobis",
         center = FALSE, scale = FALSE
       )
@@ -991,7 +992,6 @@ mbl <- function(Xr, Yr, Xu, Yu = NULL,

     neighborhoods$diss_xr_xr <- diss_xr_xr
     rm(diss_xr_xr)
-    rm(diss_method)
     gc()
   }

diff --git a/R/resemble.R b/R/resemble.R
index 1666303..a1c4f67 100644
--- a/R/resemble.R
+++ b/R/resemble.R
@@ -18,7 +18,7 @@
 #'
 #' Functions for memory-based learning
 #'
-#' \if{html}{\figure{logo.png}{options: align='right' alt='logo' width='120'}}
+#' \if{html}{\figure{logo.png}{options: style='float: right' alt='logo' width='120'}}
 #'
 #' @details
 #'
diff --git a/R/search_neighbors.R b/R/search_neighbors.R
index bce00c6..958bcbf 100644
--- a/R/search_neighbors.R
+++ b/R/search_neighbors.R
@@ -150,7 +150,7 @@
 #' input data). Default: \code{character()}. NOTE: his is an experimental
 #' argument.
 #' @param ... further arguments to be passed to the \code{\link{dissimilarity}}
-#' fucntion. See details.
+#' function. See details.
 #' @details
 #' This function may be specially useful when the reference set (\code{Xr}) is
 #' very large. In some cases the number of observations in the reference set
diff --git a/man/resemble-package.Rd b/man/resemble-package.Rd
index 84111dc..2ecc12a 100644
--- a/man/resemble-package.Rd
+++ b/man/resemble-package.Rd
@@ -10,7 +10,7 @@

 Functions for memory-based learning

-\if{html}{\figure{logo.png}{options: align='right' alt='logo' width='120'}}
+\if{html}{\figure{logo.png}{options: style='float: right' alt='logo' width='120'}}
 }
 \details{
 This is the version \code{2.1} (\code{'piapia'}) of the package. It
diff --git a/man/search_neighbors.Rd b/man/search_neighbors.Rd
index f4edb0b..764370b 100644
--- a/man/search_neighbors.Rd
+++ b/man/search_neighbors.Rd
@@ -166,7 +166,7 @@ input data). Default: \code{character()}. NOTE: his is an experimental
 argument.}

 \item{...}{further arguments to be passed to the \code{\link{dissimilarity}}
-fucntion. See details.}
+function. See details.}
 }
 \value{
 a \code{list} containing the following elements:
diff --git a/my-comments.md b/my-comments.md
index d9854e1..8a20731 100644
--- a/my-comments.md
+++ b/my-comments.md
@@ -1,5 +1,83 @@
 # resemble

+# version 2.2.1
+
+# submission message:
+Dear CRAN maintainers,
+I am submitting my package "resemble" to CRAN. This version accounts for
+problems found in Rd files auto-generated with roxygen2 7.1.2 (not compatible
+with HTML5). The new Rd files are now compatible with HTML5 (as Rd files
+are generated with roxygen2_7.2.0 ).
+Prior to this submission, this tarball has been checked with in the winbuilder service. Apart from that it has been also submitted to extensive tests in rhub.
+A first submission of this version failed (for "r-devel-linux-x86_64-debian-gcc"),
+therefore following platforms were tested for a second submission using Rhub:
+- Debian Linux, R-devel, GCC ASAN/UBSAN
+- Debian Linux, R-devel, GCC, no long double
+- Debian Linux, R-devel, clang, ISO-8859-15 locale
+- Debian Linux, R-devel, GCC
+For this second submission the package passed all the tests in the above platforms.
+Reverse dependencies have also been checked.
+Best regards,
+Leonardo
+
+
+
+## Package was built using:
+```
+devtools::build(
+  pkg = ".",
+  path = NULL,
+  binary = FALSE,
+  vignettes = TRUE,
+  manual = TRUE,
+  args = NULL,
+  quiet = FALSE
+)
+```
+
+# R win builder checks for release of `resemble 2.2.1` (`Fix-Hodges`) 30.08.2022
+passed all the checks without notes.
+
+# Rhub checks for release of `resemble 2.2.1` (`Fix-Hodges`) 30.08.2022
+The checks were conducted in the following platforms through rhub:
+
+```
+rhub::check(paste0(gsub("/resemble$", "/", getwd()), "resemble_2.2.1.tar.gz"),
+            platform = c("fedora-gcc-devel"),
+            email = "ramirez.lopez.leo@gmail.com")
+```
+- "linux-x86_64-rocker-gcc-san" OK
+
+- "fedora-gcc-devel" NOTE
+  installed size is 11.7Mb
+  sub-directories of 1Mb or more:
+    doc 1.6Mb
+    libs 9.5Mb
+
+- "windows-x86_64-devel" OK
+
+- "macos-highsierra-release-cran" OK
+
+- "windows-x86_64-release" OK
+
+- "ubuntu-gcc-release" NOTE
+  installed size is 13.5Mb
+  sub-directories of 1Mb or more:
+    doc 1.6Mb
+    libs 11.3Mb
+
+
+- "solaris-x86-patched-ods" Package suggested but not available: ‘testthat’
+
+  The suggested packages are required for a complete check.
+  Checking can be attempted without them by setting the environment
+  variable _R_CHECK_FORCE_SUGGESTS_ to a false value.
+
+  See section ‘The DESCRIPTION file’ in the ‘Writing R Extensions’
+  manual.
+
+
+
 # version 2.1.1

 ## Package was built using:
diff --git a/src/regression_methods.cpp b/src/regression_methods.cpp
index 277c14f..701293f 100644
--- a/src/regression_methods.cpp
+++ b/src/regression_methods.cpp
@@ -1159,7 +1159,6 @@ List opls_get_basics(arma::mat X,
   );
 }

-
 //' @title Prediction function for the \code{opls} and \code{fopls} functions
 //' @description Predicts response values based on a model generated by either by \code{opls} or the \code{fopls} functions.
 //' For internal use only!.
diff --git a/tests/testthat/test-mbl.R b/tests/testthat/test-mbl.R
index 5a8609f..15b2425 100644
--- a/tests/testthat/test-mbl.R
+++ b/tests/testthat/test-mbl.R
@@ -355,3 +355,77 @@ test_that("mbl delivers expeted results", {
   expect_true(all(yuv_pls_k_diss))
   expect_true(all(yuv_wapls_k_diss))
 })
+
+
+test_that("mbl with external disstances works", {
+  tol <- 1e-10
+  nirdata <- data("NIRsoil", package = "prospectr")
+
+  # Proprocess the data using detrend plus first derivative with Savitzky and
+  # Golay smoothing filter
+  sg_det <- savitzkyGolay(
+    detrend(NIRsoil$spc,
+      wav = as.numeric(colnames(NIRsoil$spc))
+    ),
+    m = 1,
+    p = 1,
+    w = 7
+  )
+
+  NIRsoil$spc_pr <- sg_det
+
+  # split into training and testing sets
+  test_x <- NIRsoil$spc_pr[NIRsoil$train == 0 & !is.na(NIRsoil$CEC), ]
+  test_y <- NIRsoil$CEC[NIRsoil$train == 0 & !is.na(NIRsoil$CEC)]
+
+  train_y <- NIRsoil$CEC[NIRsoil$train == 1 & !is.na(NIRsoil$CEC)]
+  train_x <- NIRsoil$spc_pr[NIRsoil$train == 1 & !is.na(NIRsoil$CEC), ]
+
+  my_control <- mbl_control(validation_type = "NNv")
+
+  ## The neighborhood sizes to test
+  ks <- seq(40, 140, by = 20)
+
+  ext_d <- dissimilarity(
+    rbind(train_x, test_x), Xu = rbind(train_x, test_x),
+    diss_method = "cor",
+    center = FALSE, scale = FALSE
+  )$dissimilarity
+
+  dim(ext_d)
+  diag(ext_d) <- 0
+
+
+  sbl_external_diss <- mbl(
+    Xr = train_x,
+    Yr = train_y,
+    Xu = test_x,
+    k = ks,
+    spike = 1:5,
+    method = local_fit_gpr(),
+    diss_method = ext_d,
+    diss_usage = "predictors",
+    control = my_control,
+    scale = FALSE,
+    center = FALSE
+  )
+
+  sbl_internal_diss <- mbl(
+    Xr = train_x,
+    Yr = train_y,
+    Xu = test_x,
+    k = ks,
+    spike = 1:5,
+    method = local_fit_gpr(),
+    diss_method = "cor",
+    diss_usage = "predictors",
+    control = my_control,
+    scale = FALSE,
+    center = FALSE
+  )
+
+  r_ext <- sbl_internal_diss$validation_results$nearest_neighbor_validation
+  r_int <- sbl_external_diss$validation_results$nearest_neighbor_validation
+
+  expect_true(sum(abs(r_ext - r_int)) < tol)
+})