-
-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[R] Destruct dmlc::Error
objects before long jumps
#9916
Conversation
We can have asan on CI. Let me do some tests to see the leak in action, I would like to have some hands-on experience with it. |
Note that you will also need to build R with support for ASAN in order to load a library that uses it. Easiest way is to use one of the images from https://github.com/wch/r-debug |
We've had success with that approach in LightGBM, here's the GitHub Actions config we use: In LightGBM, we saw that exactly replicate errors reported during CRAN's sanitizer checks. |
Thank you for sharing! I have run asan using Details on how I ran the asan tests, modify the R global config to have asan enabled, then preload the asan shared lib before running tests.
|
I ran some small tests and can indeed confirm that ASAN didn't show any leaks, but valgrind does show them as expected: library(xgboost)
set.seed(123)
x <- matrix(rnorm(500), nrow = 50)
y <- rnorm(400)
dim(y) <- c(50, 4, 2)
dm = xgb.DMatrix(data = x, label = y)
|
Thank you for sharing! I will run it, hopefully can get a backtrace. Do you think it's practical to write some tests for checking memory leaks, tests that would fail under asan environment without your recent fixes (like #9823), Valgrind is quite time-consuming on CI. |
I don't think it'd be practical - we'd have to start adding code with conditional compilation for that specific job that would for example manually throw exceptions like Plus, there's no guarantee that they'd be catched by ASAN even without the fixes. |
ref #9810
As a follow-up from #9903
I just noticed that the custom exception class
dmlc::Error
is subcblassed fromstd::runtime_error
, which is in turn subclassed fromstd::exception
:https://github.com/dmlc/dmlc-core/blob/ea21135fbb141ae103fb5fc960289b5601b468f2/include/dmlc/logging.h#L34
Since these objects hold dynamically-allocated memory for the error message, they need to be destructed before doing a long jump, which is how R handles errors.
Currently, I think this type of error can only be thrown by the few lines in the R interface which explicitly call
LOG()
, as otherwise the xgboost C functions signal errors by return codes; and in many cases the conditions leading to theseLOG()
calls would have been checked beforehand in R code, so I'm not sure if it's even possible to end up in thatcatch
clause, but this should prevent any potential leaks if that happens.By the way, it would be quite helpful to set up an optional CI job running the R tests and examples with either ASAN or valgrind, as then these types of leaks could be detected more easily.