[R-package] tests and examples should not use more than 2 threads #5102
Comments
I think we can determine whether we are in a CI environment or at CRAN and set
I only can't find for Azure Pipelines (need investigation): |
Thanks for those links! But since the environment variable Maybe I'm just being paranoid, I've just developed this habit of trying to minimize as much as possible assumptions about what CRAN will do or what their check environment will look like. |
Haha, OK! 😄 |
I was going to add |
Hmm seems like that is expected https://stackoverflow.com/q/27319619/19410760 and what would be required for that to work is something like what data.table does https://www.rdocumentation.org/packages/data.table/versions/1.14.2/topics/setDTthreads |
Yeah there is some interesting sourcing/loading of R code done by
What They allocate a "data.table-specific number of threads" variable in the process. And then And then all OpenMP code in the package references that, instead of using https://github.com/Rdatatable/data.table/blob/9e6e45301ea89227414a4f6df1ffc679c5c7ef1c/src/cj.c#L23 Instead of doing that, LightGBM modifies For #4705, LightGBM should probably be modified to do something like what
Maybe for now, we could use R's Like this in
options("lightgbm.num.testing.threads" = 2)
And then adding something like the following near the beginning of Line 228 in 44fe591
threads_from_opts <- options()[["lightgbm.num.testing.threads"]]
if (!is.null(threads_from_opts)) {
    return(as.integer(threads_from_opts))
}
I see mentions in https://cran.r-project.org/doc/manuals/r-release/R-exts.html that encourage the use of I think that might work more reliably with For examples in docs, I think it would be fine to just explicitly pass Could you try that? |
I think that doesn't work because |
hmmmm ok, and I realize now that even for LightGBM/R-package/R/lightgbm.R Lines 162 to 164 in 44fe591
What about adding instead something like the following in
# LightGBM-internal fix to comply with CRAN policy of only using up to 2 threads in tests and examples
threads_from_opts <- options()[["lightgbm.num.testing.threads"]]
if (!is.null(threads_from_opts)) {
    params[["num_threads"]] <- threads_from_opts
}
And now that I see it...maybe we should use a name that makes it clearer that this isn't intended to be used by users. Like I think that could work as a minimally-invasive change that could be reverted once #4705 is addressed. |
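For readers following along, the option-based override discussed in this comment can be sketched as below. This is only an illustration under stated assumptions, not LightGBM's actual implementation: `apply_thread_override()` is a hypothetical helper name, the option name is the one proposed in this thread, and the `params` list stands in for whatever parameters are passed to training.

```r
# Hypothetical helper (not part of {lightgbm}): override num_threads
# whenever the testing-only option proposed in this thread is set.
apply_thread_override <- function(params) {
    threads_from_opts <- getOption("lightgbm.num.testing.threads")
    if (!is.null(threads_from_opts)) {
        params[["num_threads"]] <- as.integer(threads_from_opts)
    }
    return(params)
}

# Assumption: test setup code sets the option before any test runs.
options("lightgbm.num.testing.threads" = 2L)
params <- apply_thread_override(list(objective = "regression", num_threads = 8L))
print(params[["num_threads"]])

# With the option removed, user-supplied parameters pass through unchanged.
options("lightgbm.num.testing.threads" = NULL)
params <- apply_thread_override(list(objective = "regression", num_threads = 8L))
print(params[["num_threads"]])
```

Because the override only fires when the option is non-NULL, ordinary users who never set it would see no change in behavior.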
Based on #5367 (comment), I am going to pick this up. Assigning it to myself to claim it so no one else spends time on it. |
Hey @jameslamb. I was thinking that since we can't set system("OMP_NUM_THREADS=1 Rscript _testthat.R") and tried different values for the variable (by calling This is way too hacky and may not work but we could maybe try something in that direction. |
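A rough sketch of the subprocess idea above, for local experimentation only. The helper name `run_with_omp_threads()` is hypothetical; it relies on base R's `system2()` and its `env` argument to launch a child `Rscript` process with `OMP_NUM_THREADS` already set, so the variable is in place before any OpenMP runtime starts.

```r
# Hypothetical helper: evaluate an R expression in a child process that
# starts with OMP_NUM_THREADS set to the requested value.
run_with_omp_threads <- function(expr_text, num_threads = 1L) {
    rscript <- file.path(R.home("bin"), "Rscript")
    system2(
        command = rscript,
        args = c("--vanilla", "-e", shQuote(expr_text)),
        env = paste0("OMP_NUM_THREADS=", as.integer(num_threads)),
        stdout = TRUE
    )
}

# Confirm that the child process sees the variable.
child_value <- run_with_omp_threads("cat(Sys.getenv('OMP_NUM_THREADS'))", 2L)
print(child_value)
```

This only shows that the environment variable reaches the child process. Whether a given library then honors it is a separate question.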
Interesting idea @jmoralez ! I hadn't considered that. I would expect it to work in local testing, but I don't think we should pursue it as a way to satisfy CRAN.
I think we should pursue more permanent solutions, on the C++ side of LightGBM, to more tightly control the number of threads used. |
Description
CRAN's submission policies at https://cran.r-project.org/web/packages/policies.html include the following guidance
Currently, most of {lightgbm}'s examples and tests do not explicitly set the number of threads to use, which means LightGBM defaults to using whatever the result of omp_get_num_threads() is.
LightGBM/include/LightGBM/utils/openmp_wrapper.h
Lines 30 to 34 in 60e72d5
Work to be done
To ensure that {lightgbm} is always respectful of the CRAN check farm, all of its examples and tests should be changed to default to using 2 threads when run on CRAN.
Similar to the work done for #4862
LightGBM/R-package/tests/testthat/test_lgb.Booster.R
Lines 1 to 3 in 820ae7e
this should be done in a way that allows LightGBM's CI to override that behavior and use all available CPUs in its CI environments (which might sometimes be more than 2).
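One way the "default to 2, overridable in CI" behavior could look in a test file is sketched below. The environment variable name `LIGHTGBM_TEST_NUM_THREADS` and the helper `get_test_num_threads()` are hypothetical, chosen only for illustration. The point is that the safe value is the default and CI has to opt in to more threads.

```r
# Hypothetical helper for test files: use 2 threads unless an environment
# variable (name assumed for illustration) requests something else.
get_test_num_threads <- function(default = 2L) {
    raw_value <- Sys.getenv("LIGHTGBM_TEST_NUM_THREADS", unset = NA_character_)
    if (is.na(raw_value) || !nzchar(raw_value)) {
        return(as.integer(default))
    }
    num_threads <- suppressWarnings(as.integer(raw_value))
    # Fall back to the default on unparseable or non-positive values.
    if (is.na(num_threads) || num_threads < 1L) {
        return(as.integer(default))
    }
    return(num_threads)
}

# Tests would then pass the value explicitly in their parameters.
params <- list(objective = "regression", num_threads = get_test_num_threads())
```

Falling back to the default on bad input keeps a misconfigured environment from accidentally using every core on a shared machine.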
References
Opened as a result of #4972 (comment).