downscaleR one process use too many CPU and TOO SLOW, WHY? #84
Hi, could you please share the dimensions of 'xs$Data' and 'wsobs$Data'? We suggest setting model.verbose = FALSE to save memory when using a GLM (type ?glm.train in the R console). It can be passed to downscaleCV as an additional argument: downscaleCV(..., model.verbose = FALSE). Please let us know if this improves the speed of the computation. Cheers, Jorge
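For anyone following along, here is a minimal sketch of what that suggestion could look like in practice. It reuses the argument values from the downscaleCV call posted in this thread and only adds model.verbose = FALSE, which is forwarded through `...` as described above. The `have.inputs` guard is my own addition (not part of downscaleR), so the snippet does nothing when the package or the grids `xs` and `wsobs` are not available.

```r
# Guard (hypothetical helper flag, not part of downscaleR): only run when the
# packages are installed and the grids from the post exist in the session.
have.inputs <- requireNamespace("downscaleR", quietly = TRUE) &&
  requireNamespace("transformeR", quietly = TRUE) &&
  exists("xs") && exists("wsobs")

if (have.inputs) {
  # The dimensions requested by the maintainer
  print(dim(xs$Data))
  print(dim(wsobs$Data))

  # Same call as in the post, with model.verbose = FALSE added
  pred <- downscaleR::downscaleCV(
    xs, wsobs,
    folds = 3,
    sampling.strategy = "kfold.chronological",
    scaleGrid.args = list(type = "standardize"),
    method = "GLM",
    model.verbose = FALSE,
    prepareData.args = list(
      "spatial.predictors" = list(which.combine = transformeR::getVarNames(xs),
                                  v.exp = 0.9)))
}
```

Whether this changes the run time noticeably is not something the thread confirms; the maintainer presents it as a way to save memory.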
In my code, downscaleCV runs extremely slowly.
Here is my code:
options(java.parameters = "-Xmx8g")
library(climate4R.UDG)
library(loadeR)
library(loadeR.2nc)
library(transformeR)
library(climate4R.datasets)
library(downscaleR)
library(visualizeR)
library(VALUE)
library(climate4R.value)
vars <- c("var151","var165","var166") #psl; uas; vas
varp <- c("var131@85000","var132@85000","var129@50000") #131-ua; 132-va; 130-ta; 129-zg;
grid.list <- lapply(vars, function(x) {
  loadGridData(dataset = "/home/inspur/working/climate4r/ERA-I/box_surface_interim_1979_2018.nc",
               var = x,
               years = 1990:2018)
})
grid.listp <- lapply(varp, function(x) {
  loadGridData(dataset = "/home/inspur/working/climate4r/ERA-I/box_pressure_interim_1979_2018.nc",
               var = x,
               years = 1990:2018)
})
xs <- makeMultiGrid(grid.list)
xp <- makeMultiGrid(grid.listp)
wsobs <- loadGridData(dataset = "/home/inspur/working/climate4r/CCMP/box_CCMP_1990_2018_ws.nc", var = "ws")
pred <- downscaleCV(xs, wsobs, folds = 3, sampling.strategy = "kfold.chronological",
                    scaleGrid.args = list(type = "standardize"),
                    method = "GLM",
                    ncores = 20,
                    prepareData.args = list(
                      "spatial.predictors" = list(which.combine = getVarNames(xs), v.exp = 0.9)))
pred.p <- downscaleCV(xp, wsobs, folds = 3, sampling.strategy = "kfold.chronological",
                      scaleGrid.args = list(type = "standardize"),
                      method = "GLM",
                      ncores = 20,
                      prepareData.args = list(
                        "spatial.predictors" = list(which.combine = getVarNames(xp), v.exp = 0.9)))
To speed it up I added the argument ncores = 20, but it does not seem to help. The downscaleCV call still takes 3 hours or more.
Here is my data structure:
str(xs)
List of 4
$ Variable:List of 2
..$ varName: chr [1:3] "var151" "var165" "var166"
..$ level : logi [1:3] NA NA NA
..- attr(*, "use_dictionary")= chr [1:3] "FALSE" "FALSE" "FALSE"
..- attr(*, "units")= chr [1:3] "" "" ""
..- attr(*, "longname")= chr [1:3] "var151" "var165" "var166"
..- attr(*, "daily_agg_cellfun")= chr [1:3] "none" "none" "none"
..- attr(*, "monthly_agg_cellfun")= chr [1:3] "none" "none" "none"
..- attr(*, "verification_time")= chr [1:3] "none" "none" "none"
$ Data : num [1:3, 1, 1:42368, 1:53, 1:40] 1.01e+05 -4.18e-01 1.86 1.01e+05 -1.46 ...
..- attr(*, "dimensions")= chr [1:5] "var" "member" "time" "lat" ...
$ xyCoords:List of 2
..$ x: num [1:40] 100 101 102 103 104 ...
..$ y: num [1:53] 10.5 11.2 12 12.8 13.5 ...
..- attr(*, "projection")= chr "LatLonProjection"
..- attr(*, "resX")= num 0.75
..- attr(*, "resY")= num 0.75
$ Dates :List of 3
..$ :List of 2
.. ..$ start: chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
.. ..$ end : chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
..$ :List of 2
.. ..$ start: chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
.. ..$ end : chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
..$ :List of 2
.. ..$ start: chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
.. ..$ end : chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
- attr(*, "dataset")= chr "/home/inspur/working/climate4r/ERA-I/box_surface_interim_1979_2018.nc"
- attr(*, "R_package_desc")= chr "loadeR-v1.7.0"
- attr(*, "R_package_URL")= chr "https://github.com/SantanderMetGroup/loadeR"
- attr(*, "R_package_ref")= chr "https://doi.org/10.1016/j.envsoft.2018.09.009"
str(wsobs)
List of 4
$ Variable:List of 2
..$ varName: chr "ws"
..$ level : NULL
..- attr(*, "use_dictionary")= logi FALSE
..- attr(*, "units")= chr ""
..- attr(*, "longname")= chr "ws"
..- attr(*, "daily_agg_cellfun")= chr "none"
..- attr(*, "monthly_agg_cellfun")= chr "none"
..- attr(*, "verification_time")= chr "none"
$ Data : num [1:42368, 1:160, 1:120] 4.01 3.01 4.6 4.91 6.64 ...
..- attr(*, "dimensions")= chr [1:3] "time" "lat" "lon"
$ xyCoords:List of 2
..$ x: num [1:120] 100 100 101 101 101 ...
..$ y: num [1:160] 10.1 10.4 10.6 10.9 11.1 ...
..- attr(*, "projection")= chr "LatLonProjection"
..- attr(*, "resX")= num 0.25
..- attr(*, "resY")= num 0.25
$ Dates :List of 2
..$ start: chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
..$ end : chr [1:42368] "1990-01-01 00:00:00 GMT" "1990-01-01 06:00:00 GMT" "1990-01-01 12:00:00 GMT" "1990-01-01 18:00:00 GMT" ...
- attr(*, "dataset")= chr "/home/inspur/working/climate4r/CCMP/box_CCMP_1990_2018_ws.nc"
- attr(*, "R_package_desc")= chr "loadeR-v1.7.0"
- attr(*, "R_package_URL")= chr "https://github.com/SantanderMetGroup/loadeR"
- attr(*, "R_package_ref")= chr "https://doi.org/10.1016/j.envsoft.2018.09.009"
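The dimensions in the two str() listings already give a rough idea of the problem size. The base R sketch below just does the arithmetic: the memory each Data array needs as 8-byte doubles, the number of predictand grid points, and the approximate number of time steps in each training set under 3 chronological folds. The variable names are mine, and the per-grid-point reading of the work involved is an assumption about how the GLM method is applied, not something stated in this thread.

```r
# Dimensions copied from the str() output above
xs.dim    <- c(var = 3, member = 1, time = 42368, lat = 53, lon = 40)
wsobs.dim <- c(time = 42368, lat = 160, lon = 120)

bytes.per.double <- 8  # R stores numeric arrays as 8-byte doubles

# Size of an array with the given dimensions, in GiB
gib <- function(d) prod(d) * bytes.per.double / 1024^3
xs.gib    <- gib(xs.dim)
wsobs.gib <- gib(wsobs.dim)

# Number of predictand grid points (160 x 120)
n.points <- unname(wsobs.dim["lat"] * wsobs.dim["lon"])

# Approximate training length per fold: two of the three folds are used to train
n.train <- round(unname(wsobs.dim["time"]) * 2 / 3)

cat(sprintf("xs$Data:    %.2f GiB\n", xs.gib))
cat(sprintf("wsobs$Data: %.2f GiB\n", wsobs.gib))
cat(sprintf("predictand grid points: %d\n", as.integer(n.points)))
cat(sprintf("time steps per training set (approx.): %d\n", as.integer(n.train)))
```

So the predictors take about 2 GiB and the predictand about 6 GiB before any copies are made, and there are 19200 predictand grid points. If a separate GLM is fit for each grid point and each fold (my assumption), that is a large number of fits on roughly 28000 six-hourly time steps each, which may be why the job is heavier than the box size suggests.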
options(java.parameters = "-Xmx8g")
library(climate4R.UDG)
library(loadeR)
library(loadeR.2nc)
library(transformeR)
library(climate4R.datasets)
library(downscaleR)
library(visualizeR)
library(VALUE)
library(climate4R.value)
vars <- c("var151","var165","var166") #psl; uas; vas
varp <- c("var131@85000","var132@85000","var129@50000") #131-ua; 132-va; 130-ta; 129-zg;
grid.list <- lapply(vars, function(x) {
  loadGridData(dataset = "/home/inspur/working/climate4r/ERA-I/box_surface_interim_1979_2018.nc",
               var = x,
               years = 1990:2018)
})
grid.listp <- lapply(varp, function(x) {
  loadGridData(dataset = "/home/inspur/working/climate4r/ERA-I/box_pressure_interim_1979_2018.nc",
               var = x,
               years = 1990:2018)
})
pred <- downscaleCV(xs, wsobs, folds = 3, sampling.strategy = "kfold.chronological",
                    scaleGrid.args = list(type = "standardize"),
                    method = "GLM",
                    prepareData.args = list(
                      "spatial.predictors" = list(which.combine = getVarNames(xs), v.exp = 0.9)))
It is shocking that downscaleCV uses 12603% CPU in a single process and is still too slow. Why?
Only a 30*40 box of the ERA-I dataset is used for the downscaling. The dataset is small, so why does it take so many resources?
Here is the CentOS 7 top output:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
553757 inspur 20 0 105.6g 74.1g 28200 R 12603 7.4 440:22.97 R