downscaleR one process use too many CPU and TOO SLOW, WHY? #84

louwangzhiyuwhy · 2021-10-15T09:33:30Z

options(java.parameters = "-Xmx8g")

library(climate4R.UDG)
library(loadeR)
library(loadeR.2nc)
library(transformeR)
library(climate4R.datasets)
library(downscaleR)
library(visualizeR)
library(VALUE)
library(climate4R.value)

vars <- c("var151","var165","var166") #psl; uas; vas
varp <- c("var131@85000","var132@85000","var129@50000") #131-ua; 132-va; 130-ta; 129-zg;
grid.list <- lapply(vars, function(x) {
  loadGridData(dataset = "/home/inspur/working/climate4r/ERA-I/box_surface_interim_1979_2018.nc",
               var = x,
               years = 1990:2018)
})
grid.listp <- lapply(varp, function(x) {
  loadGridData(dataset = "/home/inspur/working/climate4r/ERA-I/box_pressure_interim_1979_2018.nc",
               var = x,
               years = 1990:2018)
})
xs <- makeMultiGrid(grid.list)
wsobs <- loadGridData(dataset = "/home/inspur/working/climate4r/CCMP/box_CCMP_1990_2018_ws.nc",
                      var = "ws")
pred <- downscaleCV(xs, wsobs, folds = 3, sampling.strategy = "kfold.chronological",
                    scaleGrid.args = list(type = "standardize"),
                    method = "GLM",
                    prepareData.args = list(
                      "spatial.predictors" = list(which.combine = getVarNames(xs), v.exp = 0.9)))

It is very surprising that a single downscaleCV process uses 12603% CPU and is still very slow. Why?
The predictors are a 30*40 box of the ERA-I dataset, which should be small enough. Why does the downscaling take so many resources?
Here is the CentOS 7 top output:
   PID USER   PR NI   VIRT   RES   SHR S  %CPU %MEM     TIME+ COMMAND
553757 inspur 20  0 105.6g 74.1g 28200 R 12603  7.4 440:22.97 R
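
To help read the top output above: in top's default per-core mode, 100% corresponds to one fully busy core, so a %CPU of 12603 means roughly 126 cores are busy inside this single R process. The sketch below does that conversion and prints a few settings that commonly control implicit multithreading. It uses only base R and the parallel package. Whether a threaded BLAS/OpenMP library is actually responsible here is an assumption, not something the output confirms.

```r
# %CPU value copied from the top output above.
cpu_percent <- 12603

# top (default per-core mode) reports 100% per fully used core.
busy_cores <- cpu_percent / 100

# Logical cores visible to R on this machine (may be NA on some platforms).
available_cores <- parallel::detectCores()

# Environment variables that commonly cap library-level threading.
# Assumption: an empty value means the library picks its own default,
# which is often "all cores".
thread_vars <- Sys.getenv(c("OMP_NUM_THREADS",
                            "OPENBLAS_NUM_THREADS",
                            "MKL_NUM_THREADS"))

cat(sprintf("Busy cores implied by top: %.0f\n", busy_cores))
cat(sprintf("Cores visible to R: %s\n", available_cores))
print(thread_vars)
```

If the busy-core count is close to the number of cores on the machine, the load is probably coming from threads inside one process rather than from the ncores argument.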

jorgebanomedina · 2021-10-15T14:07:27Z

Hi,

Could you please share the dimensions of 'xs$Data' and 'wsobs$Data'? When using a GLM, we suggest setting model.verbose = FALSE to save memory (type ?glm.train in the R console). It can be passed to downscaleCV as an additional argument: downscaleCV(..., model.verbose = FALSE)
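
Applied to the call from the original post, the suggestion would look roughly like the sketch below. It assumes xs and wsobs are the grids built by the reporter's script. The guard skips the call when they or downscaleR are not available, so this is an illustration of where the argument goes, not a benchmark or a guaranteed fix.

```r
# Extra argument suggested above for the GLM method.
extra_args <- list(model.verbose = FALSE)

# Assumption: `xs` (predictors) and `wsobs` (predictand) come from the
# reporter's script. The call is skipped if they or downscaleR are missing.
if (requireNamespace("downscaleR", quietly = TRUE) &&
    exists("xs") && exists("wsobs")) {
  base_args <- list(
    xs, wsobs,
    folds = 3,
    sampling.strategy = "kfold.chronological",
    scaleGrid.args = list(type = "standardize"),
    method = "GLM",
    prepareData.args = list(
      "spatial.predictors" = list(which.combine = transformeR::getVarNames(xs),
                                  v.exp = 0.9))
  )
  # Same call as in the original post, plus model.verbose = FALSE.
  pred <- do.call(downscaleR::downscaleCV, c(base_args, extra_args))
}
```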

Please let us know if this improves the speed of the computation,

Cheers,

Jorge

louwangzhiyuwhy · 2021-10-15T15:12:54Z

In my code, downscaleCV runs extremely slowly. Here is my code:

options(java.parameters = "-Xmx8g")

library(climate4R.UDG)
library(loadeR)
library(loadeR.2nc)
library(transformeR)
library(climate4R.datasets)
library(downscaleR)
library(visualizeR)
library(VALUE)
library(climate4R.value)

vars <- c("var151","var165","var166") #psl; uas; vas
varp <- c("var131@85000","var132@85000","var129@50000") #131-ua; 132-va; 130-ta; 129-zg;
grid.list <- lapply(vars, function(x) {
  loadGridData(dataset = "/home/inspur/working/climate4r/ERA-I/box_surface_interim_1979_2018.nc",
               var = x,
               years = 1990:2018)
})
grid.listp <- lapply(varp, function(x) {
  loadGridData(dataset = "/home/inspur/working/climate4r/ERA-I/box_pressure_interim_1979_2018.nc",
               var = x,
               years = 1990:2018)
})
xs <- makeMultiGrid(grid.list)
xp <- makeMultiGrid(grid.listp)
wsobs <- loadGridData(dataset = "/home/inspur/working/climate4r/CCMP/box_CCMP_1990_2018_ws.nc",
                      var = "ws")
pred <- downscaleCV(xs, wsobs, folds = 3, sampling.strategy = "kfold.chronological",
                    scaleGrid.args = list(type = "standardize"),
                    method = "GLM", ncores = 20,
                    prepareData.args = list(
                      "spatial.predictors" = list(which.combine = getVarNames(xs), v.exp = 0.9)))
pred.p <- downscaleCV(xp, wsobs, folds = 3, sampling.strategy = "kfold.chronological",
                      scaleGrid.args = list(type = "standardize"),
                      method = "GLM", ncores = 20,
                      prepareData.args = list(
                        "spatial.predictors" = list(which.combine = getVarNames(xp), v.exp = 0.9)))

To speed it up I added the argument ncores = 20, but it seems to have no effect. Each downscaleCV call takes 3 hours or more. Here is my data structure:

str(xs)

List of 4
 $ Variable:List of 2
  ..$ varName: chr [1:3] "var151" "var165" "var166"
  ..$ level  : logi [1:3] NA NA NA
  ..- attr(*, "use_dictionary")= chr [1:3] "FALSE" "FALSE" "FALSE"
  ..- attr(*, "units")= chr [1:3] "" "" ""
  ..- attr(*, "longname")= chr [1:3] "var151" "var165" "var166"
  ..- attr(*, "daily_agg_cellfun")= chr [1:3] "none" "none" "none"
  ..- attr(*, "monthly_agg_cellfun")= chr [1:3] "none" "none" "none"
  ..- attr(*, "verification_time")= chr [1:3] "none" "none" "none"
 $ Data    : num [1:3, 1, 1:42368, 1:53, 1:40] 1.01e+05 -4.18e-01 1.86 1.01e+05 -1.46 ...
  ..- attr(*, "dimensions")= chr [1:5] "var" "member" "time" "lat" ...
 $ xyCoords:List of 2
  ..$ x: num [1:40] 100 101 102 103 104 ...
  ..$ y: num [1:53] 10.5 11.2 12 12.8 13.5 ...
  ..- attr(*, "projection")= chr "LatLonProjection"
  ..- attr(*, "resX")= num 0.75
  ..- attr(*, "resY")= num 0.75
 $ Dates   :List of 3
  ..$ :List of 2
  .. ..$ start: chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
  .. ..$ end  : chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
  ..$ :List of 2
  .. ..$ start: chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
  .. ..$ end  : chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
  ..$ :List of 2
  .. ..$ start: chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
  .. ..$ end  : chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
 - attr(*, "dataset")= chr "/home/inspur/working/climate4r/ERA-I/box_surface_interim_1979_2018.nc"
 - attr(*, "R_package_desc")= chr "loadeR-v1.7.0"
 - attr(*, "R_package_URL")= chr "https://github.com/SantanderMetGroup/loadeR"
 - attr(*, "R_package_ref")= chr "https://doi.org/10.1016/j.envsoft.2018.09.009"

str(wsobs)

List of 4
 $ Variable:List of 2
  ..$ varName: chr "ws"
  ..$ level  : NULL
  ..- attr(*, "use_dictionary")= logi FALSE
  ..- attr(*, "units")= chr ""
  ..- attr(*, "longname")= chr "ws"
  ..- attr(*, "daily_agg_cellfun")= chr "none"
  ..- attr(*, "monthly_agg_cellfun")= chr "none"
  ..- attr(*, "verification_time")= chr "none"
 $ Data    : num [1:42368, 1:160, 1:120] 4.01 3.01 4.6 4.91 6.64 ...
  ..- attr(*, "dimensions")= chr [1:3] "time" "lat" "lon"
 $ xyCoords:List of 2
  ..$ x: num [1:120] 100 100 101 101 101 ...
  ..$ y: num [1:160] 10.1 10.4 10.6 10.9 11.1 ...
  ..- attr(*, "projection")= chr "LatLonProjection"
  ..- attr(*, "resX")= num 0.25
  ..- attr(*, "resY")= num 0.25
 $ Dates   :List of 2
  ..$ start: chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
  ..$ end  : chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
 - attr(*, "dataset")= chr "/home/inspur/working/climate4r/CCMP/box_CCMP_1990_2018_ws.nc"
 - attr(*, "R_package_desc")= chr "loadeR-v1.7.0"
 - attr(*, "R_package_URL")= chr "https://github.com/SantanderMetGroup/loadeR"
 - attr(*, "R_package_ref")= chr "https://doi.org/10.1016/j.envsoft.2018.09.009"
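
As a rough sketch of the scale implied by the two str() outputs above, the arithmetic below estimates the in-memory size of each Data array and the number of predictand grid points. The dimensions are copied from the output. The count of model fits rests on the assumption that the GLM method fits one model per predictand grid point in each fold, which should be checked against the downscaleR documentation before relying on it.

```r
# Dimensions copied from the str() output above.
xs_dim    <- c(var = 3, member = 1, time = 42368, lat = 53, lon = 40)
wsobs_dim <- c(time = 42368, lat = 160, lon = 120)

# R stores numeric arrays as 8-byte doubles.
bytes_per_double <- 8

# Size of a dense numeric array in GiB, ignoring attribute overhead.
gib <- function(dims) prod(dims) * bytes_per_double / 1024^3

xs_gib    <- gib(xs_dim)
wsobs_gib <- gib(wsobs_dim)

# Predictand grid points (160 x 120) and cross-validation folds.
n_points <- unname(wsobs_dim["lat"] * wsobs_dim["lon"])
n_folds  <- 3

# Assumption: one GLM per predictand grid point per fold.
n_fits <- n_points * n_folds

cat(sprintf("xs$Data:    %.2f GiB\n", xs_gib))
cat(sprintf("wsobs$Data: %.2f GiB\n", wsobs_gib))
cat(sprintf("Predictand grid points: %d\n", as.integer(n_points)))
cat(sprintf("GLM fits under the stated assumption: %d\n", as.integer(n_fits)))
```

Under these assumptions the predictand alone is about 6 GiB and covers 19200 grid points over 42368 six-hourly time steps, so the problem is considerably larger than the size of the predictor box suggests.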
