Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R] Fix memory safety issues #9823

Merged
merged 6 commits into from
Dec 2, 2023
Merged

Conversation

david-cortes
Copy link
Contributor

fixes #7246
ref #9810
ref #9817

This PR fixes some remaining memory safety issues:

  • Allocations from R interleaved with allocations from C++ that could leave leaked objects in case of R failures.
  • Lengths using a type of insufficient width that might overflow.

@david-cortes
Copy link
Contributor Author

Added them under the namespace that was introduced in #9816 to make a difference w.r.t. the exported functions. Not sure why they would need to have [[nodiscard]] though.

Copy link
Member

@trivialfis trivialfis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the work on fixing the memory safety issue and demonstrating how to properly work with R in C. Left some comments in the hope that we can make the code easier to understand for new comers.

} else {
for (size_t i = 0; i < olen; ++i) {
std::stringstream stream;
stream << "booster[" << i <<"]\n" << res[i];
Copy link
Member

@trivialfis trivialfis Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR. We will probably need to convert the indexing for tree dump into 1-based in the future.


struct ErrorWithUnwind : public std::exception {};

void throw_exception_from_R_error(void *unused, Rboolean jump) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really appreciate it if you could help add some references and documents for R unwinding with C++. I found your comments in the LGBM issue, which is helpful for providing context. I have some familiarity with continuation-passing, but still trying to wrap my head around how this works.

https://cran.r-project.org/doc/manuals/r-patched/R-exts.html#Condition-handling-and-cleanup-code-1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the only documentation there is.

In case it helps: when a failure occurs, the long jump is triggered from the line that calls R_ContinueUnwind. By that point, all the C++ objects will have been destructed, since that line passes through the 'try' block. Upon throwing an R error, all the R allocations that happened during the .Call part are freed.

Simpler example of this usage can be found here: https://github.com/david-cortes/unwindprotect

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation and the example usage. This has been most helpful.

R-package/src/xgboost_R.cc Outdated Show resolved Hide resolved
auto ctx = DMatrixCtx(R_ExternalPtrAddr(handle));
if (!strcmp("group", name)) {
std::vector<unsigned> vec(len);
const int *array_ = INTEGER(array);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems unrelated once #9824 is merged. I will skip this part of review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to mention: one is not supposed to call R C functions in parallel from different threads. It might not be a problem with regular objects, but for altrep'd classes a call to INTEGER could potentially trigger an R allocation.

Comment on lines +454 to +455
SEXP name_ = PROTECT(Rf_asChar(name));
SEXP val_ = PROTECT(Rf_asChar(val));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be great if there's a utility function/class that does this. But the longjmp and macros complicate things and I'm not sure if the following code is actually safe.

class SafeRasChar {
   std::int32_t cnt_{0}
  public:
    SEXP AsChar(SEXP str) {
        auto exp = PROTECT(Rf_asChar(str));
        cnt_ ++;
    }
    ~SafeRasChar () {
        UNPROTECT(cnt_);
    }
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's also an option, but be aware that it doesn't play along with rchk: kalibera/rchk#27

Since it seems rchk is not used in the CI jobs, maybe that's a good idea. I know DuckDB for example uses this type of pattern in their R interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but be aware that it doesn't play along with rchk

Thanks for the reference. Let's not complicate the cran submit process. ;-)

@@ -655,9 +712,19 @@ XGB_DLL SEXP XGBoosterGetAttr_R(SEXP handle, SEXP name) {

XGB_DLL SEXP XGBoosterSetAttr_R(SEXP handle, SEXP name, SEXP val) {
R_API_BEGIN();
const char *v = isNull(val) ? nullptr : CHAR(asChar(val));
const char *v = nullptr;
SEXP name_ = PROTECT(Rf_asChar(name));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the previous comment for a utility class.

@david-cortes
Copy link
Contributor Author

david-cortes commented Nov 30, 2023

Regarding your other comment:

Is it possible to return errors in a more linear fashion?

It would require modifying the macros R_API_BEGIN, R_API_END, CHECK_CALL; and then modifying functions to be mindful of whether they put objects before/after those. I guess that's more work than the current code here, but an example of how it could work otherwise can be found in lightgbm: https://github.com/microsoft/LightGBM/blob/master/R-package/src/lightgbm_R.cpp

If you want an easier and safer way, there's also Rcpp::exception, Rcpp::unwindProtect, etc. but that would involve having Rcpp as a dependency.

@david-cortes
Copy link
Contributor Author

Then, there's also the possibility of passing R allocator functions to C++ functions, but this is perhaps a lot more involved and harder to remember:
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Memory-allocation

@david-cortes
Copy link
Contributor Author

@trivialfis It looks like the CUDA job uses R 3.3, which is very old so doesn't have the unwind protection that was introduced in R 3.5, for example. Could you update those to R >= 4.3 for this and the following PRs?

@trivialfis
Copy link
Member

Could you update those to R >= 4.3 for this and the following PRs?

I have merged the update into this branch and the test build has passed.

@trivialfis trivialfis merged commit 7196c9d into dmlc:master Dec 2, 2023
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

R memory safety issues
2 participants