Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(R) Functions that replace R headers are not compatible with all R versions #4216

Closed
david-cortes opened this issue Apr 22, 2021 · 3 comments · Fixed by #4247
Closed

(R) Functions that replace R headers are not compatible with all R versions #4216

david-cortes opened this issue Apr 22, 2021 · 3 comments · Fixed by #4247

Comments

@david-cortes
Copy link
Contributor

david-cortes commented Apr 22, 2021

Migrated from here: #4207

The R interface for LightGBM has replacements for R's internal headers which re-define the structs used by R's variables. These underlying structures behind R objects are however not constant between versions, and currently the LighGBM replacements will not work correctly with R 3.6 for windows and mac (which from the requirements is meant to be supported, and is tested in the CI pipelines in github here).

Example:

library(lightgbm)
data(mtcars)
X <- as.matrix(mtcars[,-1])
y = mtcars[,1,drop=TRUE]
m = lightgbm(X, y, params=list(objective="regression", min_data=1))
pred_before = predict(m, X)
mode(X) = "double"
pred_after = predict(m, X)
print(pred_before)
 [1] 20.68292 20.68292 21.85530 20.94345 19.18488 18.79409 16.31907 22.89741 21.85530 19.51054 18.59869 17.68684 18.27303
[14] 16.90526 13.77891 13.77891 16.57959 28.10799 26.80534 29.08497 21.00858 17.10065 16.90526 15.66774 19.51054 24.78625
[27] 23.93953 26.80534 17.29605 19.83620 16.77499 20.94345
print(pred_after)
 [1] 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534
[14] 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534
[27] 26.80534 26.80534 26.80534 26.80534 26.80534 26.80534

(This was tested with R 3.6.3 and RTools35 on windows, but the problem will not be present if tested with R 4.0.4 in linux)

More concretely, what's failing there is the function R_REAL_PTR(lightbm_struct) (meant to obtain a pointer to the array of doubles underneath an R matrix or vector), which does not output the same as the R C function REAL(SEXP), returning instead a pointer to the address of some other array or struct. This not only causes incorrect results but can also lead to segmentation faults.

For now I think an easy solution would be to require R >= 4.0.0, but this is not a real solution as the structs can change at any time as the R headers are not meant to be replaced by libraries.

@jameslamb
Copy link
Collaborator

Thanks very much for the write-up! I installed R 3.6.3 on my Mac tonight (macOS 10.14.6).

I tried your reproducible example, and can confirm that if you call predict() on a double-mode matrix the Predictor produces constant predictions when it shouldn't based on the structure of the model.

It also looks like changing the mode of the training data to double results in no trees being added.

library(lightgbm)
data(mtcars)
X <- as.matrix(mtcars[,-1])
y = mtcars[,1,drop=TRUE]

# train on numeric data
m = lightgbm(X, y, params=list(objective="regression", min_data=1))
modelDT <- lightgbm::lgb.model.dt.tree(m)
head(modelDT)

# train on double data
mode(X) <- "double"
m <- lightgbm(X, y, params=list(objective="regression", min_data=1))
modelDT <- lightgbm::lgb.model.dt.tree(m)
head(modelDT)

@jameslamb
Copy link
Collaborator

Linking #3016

@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
2 participants