RAM keep increasing until crash when run many brms/Stan models in parallel based on futures #643
Unanswered
Yunrui-Liu
asked this question in
Q&A
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
@HenrikBengtsson
In my project, I plan to run thousands of bayesian multilevel model using brm package, which means it’s time-consuming. So I try to use furrr package to run these models in parallel. Everything is good, but the RAM keep increasing until R crash. As shown in the attached figure, the memory usage ramp up over the night from Friday to Saturday, then max out at ~250 GB (insane amount of RAM) and crashed.
I already used rm(fit) and gc() command after I save the model results. I also tried use plan(callr, workers = parallel::detectCores()-1) instead of plan(multisession, workers = parallel::detectCores()-1). But it does not make a big difference.
I am wondering do you have any experience about run brms in parallel efficiently and quickly, and how to solve the memory problem.
Here is details of code:
rm(list=ls())
#Load Packages
library(tidyverse)
library(dplyr)
library(rstan)
library(lme4)
library(brms)
library(tidybayes)
library(parallel)
library(furrr)
#Set working directory
setwd(“…/data”)
data_path ← “…/Analysis”
#Load data
#Create index
index ← crossing(
Event = c(unique(nested.psw_small$Event),“none”), #Event_length=15
Trait = unique(bfi_wide$Trait), # Trait_length=5
RE = c(“int”, “intslope”), # RE_length=2
Match = c(“matched”, “unmatched”)) %>% # match_length = 2
filter(!(RE == “int” & Event == “none”) &
!(Match == “matched” & Event == “none”) &
!(RE == “int” & Match == “unmatched”))
#Create multilevel model function
growth_fun ← function(event, trait, re, match){
lapply(1:10, function(x){
rstan_options(auto_write = TRUE)
set prior
Prior <- c(set_prior("cauchy(0,1)", class = "sd"), set_prior("cauchy(0,1)", class = "sigma"))
get formula
if(re == "intslope" & event != "none"){
f <- value ~ new.wavele.group + (new.wave|PROC_SID)
}else if (re == "int" & match == "matched"){
f <- value ~ new.wavele.group + (1|PROC_SID)
}else{
f <- value ~ new.wave + (new.wave|PROC_SID)
}
set specifications for models with convergence issues
if((trait == "A" & event %in% c("ChldMvOut", "MoveIn", "ChldBrth")) |
(trait == "E" & event %in% c("ParDied")) |
(trait == "C" & event %in% c("Retire")) |
(trait == "O" & event %in% c("Unemploy")) |
event %in% c("PartDied", "FrstJob", "Divorce", "LeftPar")){
Iter <- 8000; Warmup <- 4000; treedepth <- 20
} else {Iter <- 2000; Warmup <- 1000; treedepth <- 10}
run models
fit <- brm(formula = f, data = df, prior = Prior, iter = Iter, warmup = Warmup,
control = list(adapt_delta = 0.99, max_treedepth = treedepth))
file <- sprintf("%s/data/Results_data/Test/%s_%s_%s_chain%s_%s.RData", data_path, trait, event, match, x, re)
save(fit, file = file)
rm(fit)
gc()
})
}
#Run multilevel model
plan(multisession, workers = parallel::detectCores()-1, gc = TRUE)
index %>%
mutate(mod = future_pmap(list(Event, Trait, RE, Match), growth_fun))
Beta Was this translation helpful? Give feedback.
All reactions