Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conflict with conflicted library, remove uses of apropos()? #1141

Open
emstruong opened this issue Oct 16, 2024 · 5 comments
Open

Conflict with conflicted library, remove uses of apropos()? #1141

emstruong opened this issue Oct 16, 2024 · 5 comments

Comments

@emstruong
Copy link

emstruong commented Oct 16, 2024

Hello,

It appears that Rstan's use of the apropos() function is causing conflicts with the conflicted library, so I was advised to open an issue here to see if all uses of apropos() could be removed?

It doesn't seem like changing apropos() is feasible and it's not clear if it would be a good idea or even possible to change the conflicted library to make things work. Also, conflicted seems like a really important package for computational reproducibility and considering R's nature.

Please see the attached discussion for further context. Here

Michael

@jgabry
Copy link
Member

jgabry commented Oct 16, 2024

Hi, thanks for reporting this. I didn't even know RStan called apropos but I guess many years ago it was used here on line 82:

# find possibly identical stanmodels
model_re <- "(^[[:alnum:]]{2,}.*$)|(^[A-E,G-S,U-Z,a-z].*$)|(^[F,T].+)"
if(!is.null(model_name))
if(!grepl(model_re, model_name))
stop("model name must match ", model_re)
S4_objects <- apropos(model_re, mode="S4", ignore.case = FALSE)
if (length(S4_objects) > 0) {
e <- environment()
stanfits <- sapply(mget(S4_objects, envir = e, inherits = TRUE),
FUN = is, class2 = "stanfit")
stanmodels <- sapply(mget(S4_objects, envir = e, inherits = TRUE),
FUN = is, class2 = "stanmodel")
if (any(stanfits)) for (i in names(which(stanfits))) {
obj <- get_stanmodel(get(i, envir = e, inherits = TRUE))
if (identical(obj@model_code[1], stanc_ret$model_code[1])) return(obj)
}
if (any(stanmodels)) for (i in names(which(stanmodels))) {
obj <- get(i, envir = e, inherits = TRUE)
if (identical(obj@model_code[1], stanc_ret$model_code[1])) return(obj)
}
}

But I'm not sure what it should be changed to. Is there a recommended alternative to apropos that wouldn't cause this problem? Looking at the RStan code here, it seems like it's trying to find existing S4 objects that are stanmodel objects identical to the one being created, in which case it returns the existing one instead of creating a new one and compiling again (@bgoodri is that right?).

@emstruong
Copy link
Author

Hi, thanks for reporting this. I didn't even know RStan called apropos but I guess many years ago it was used here on line 82:

# find possibly identical stanmodels
model_re <- "(^[[:alnum:]]{2,}.*$)|(^[A-E,G-S,U-Z,a-z].*$)|(^[F,T].+)"
if(!is.null(model_name))
if(!grepl(model_re, model_name))
stop("model name must match ", model_re)
S4_objects <- apropos(model_re, mode="S4", ignore.case = FALSE)
if (length(S4_objects) > 0) {
e <- environment()
stanfits <- sapply(mget(S4_objects, envir = e, inherits = TRUE),
FUN = is, class2 = "stanfit")
stanmodels <- sapply(mget(S4_objects, envir = e, inherits = TRUE),
FUN = is, class2 = "stanmodel")
if (any(stanfits)) for (i in names(which(stanfits))) {
obj <- get_stanmodel(get(i, envir = e, inherits = TRUE))
if (identical(obj@model_code[1], stanc_ret$model_code[1])) return(obj)
}
if (any(stanmodels)) for (i in names(which(stanmodels))) {
obj <- get(i, envir = e, inherits = TRUE)
if (identical(obj@model_code[1], stanc_ret$model_code[1])) return(obj)
}
}

But I'm not sure what it should be changed to. Is there a recommended alternative to apropos that wouldn't cause this problem? Looking at the RStan code here, it seems like it's trying to find existing S4 objects that are stanmodel objects identical to the one being created, in which case it returns the existing one instead of creating a new one and compiling again (@bgoodri is that right?).

Hi, perhaps something like ls() or objects() might be made to be more appropriate? apropos() calls grep() on ls() internally, anyways. I'm not deep enough into S4 objects and the RStan codebase to know if this would be satisfactory.

The 'real issue' is that because apropos() internally invokes search(), this ends up eventually triggering conflicted when vapply() is applied onto the .conflicted (mis-spelling?) object, which brings us to this issue and the advice to not use apropos in libraries.

@ellisp
Copy link

ellisp commented Nov 6, 2024

Here is a reprex of the problem as it appeared to me.

library(conflicted)
library(rstan)
library(dplyr)

scode <- "
parameters {
  array[2] real y;
}
model {
  y[1] ~ normal(0, 1);
  y[2] ~ double_exponential(0, 2);
}
"
fit1 <- stan(model_code = scode, iter = 10, verbose = FALSE)

@hadley
Copy link

hadley commented Nov 6, 2024

The problem is that because inherits = TRUE and mode = "S4" stan is effectively evaluating every symbol in every environment that is a parent of the function environment (i.e. all the packages that it imports before eventually getting to the global environment). I think you are probably more likely to find the model that you're looking for by doing e <- parent.frame() and then setting inherits = FALSE, but you might also avoid the conflict with conflicted by dropping the mode check (or doing it yourself later).

...

After a little more investigation part of the problem, I noticed that model_re is pretty broad, so I don't think dropping mode is likely to have much impact. I also forgot that apropos() doesn't take an environment, but I think you could do something like this:

is_stan_model <- function(x) {
  isS4(obj) && (is(obj, "stanfit") || is(obj, "stanmodel"))
}

find_matching_model <- function(model_code, env) {
  model_re <- "(^[[:alnum:]]{2,}.*$)|(^[A-E,G-S,U-Z,a-z].*$)|(^[F,T].+)" 

  names <- ls(envir = env)
  candidate_names <- names[grepl(model_re, names)]

  for (name in candidate_names) {
    obj <- get(name, envir = env, inherits = FALSE)
    if (is_stan_model(obj) && identical(obj@model_code[1], model_code)) {
      return(obj)
    }
  }
  
  NULL
}

If you wanted to be extra safe to preserve the existing behaviour, you could call find_matching_model with both parent.frame() and globalenv() (when they are different).

@jgabry
Copy link
Member

jgabry commented Nov 15, 2024

@hadley Thanks for the clarifying comments and the code suggestions, much appreciated. @bgoodri what do you think about using @hadley's suggestion? I can put together a PR for it if you're on board with it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants