Intended behavior when data_ice has no rows #460

Closed
tobiasmuetze opened this issue Dec 5, 2024 · 4 comments · Fixed by #461
Labels
bug Something isn't working

Comments

@tobiasmuetze

Describe the bug
I am trying to understand the intended behavior of draws() when the data_ice argument has no rows. When data_ice is a tibble or data.frame with 0 rows, draws() fails with the error "Error in longdata$set_strategies(data_ice) : object 'has_nonMAR_to_MAR' not found".

This is not a particularly helpful error for debugging. I am wondering 1) whether this has to result in an error or whether a warning might be more appropriate, and 2) if it has to result in an error, whether the message could be made more meaningful.

To Reproduce

I'll use the code from the vignette as an example.

library(rbmi)
library(dplyr)

data("antidepressant_data")
dat <- antidepressant_data

# Use expand_locf to add rows corresponding to visits with missing outcomes to the dataset
dat <- expand_locf(
  dat,
  PATIENT = levels(dat$PATIENT), # expand by PATIENT and VISIT 
  VISIT = levels(dat$VISIT),
  vars = c("BASVAL", "THERAPY"), # fill with LOCF BASVAL and THERAPY
  group = c("PATIENT"),
  order = c("PATIENT", "VISIT")
)

# Create data_ice and set the imputation strategy to JR for
# each patient with at least one missing observation
dat_ice <- dat %>% 
  arrange(PATIENT, VISIT) %>% 
  filter(is.na(CHANGE)) %>% 
  group_by(PATIENT) %>% 
  slice(1) %>%
  ungroup() %>% 
  select(PATIENT, VISIT) %>% 
  mutate(strategy = "JR") %>% 
  filter(VISIT == 10) # matches no rows here, so dat_ice ends up with 0 rows

# In this dataset, subject 3618 has an intermittent missing value which does not correspond
# to a study drug discontinuation. We therefore remove this subject from `dat_ice`.
# (In the later imputation step, it will automatically be imputed under the default MAR assumption.)
dat_ice <- dat_ice[-which(dat_ice$PATIENT == 3618),]

# Define the names of key variables in our dataset using `set_vars()`
# and the covariates included in the imputation model
# Note that the covariates argument can also include interaction terms
vars <- set_vars(
  outcome = "CHANGE",
  visit = "VISIT",
  subjid = "PATIENT",
  group = "THERAPY",
  covariates = c("BASVAL*VISIT", "THERAPY*VISIT")
)

# Define which imputation method to use (here: Bayesian multiple imputation with 150 imputed datasets)
method <- method_bayes(
  burn_in = 200,
  burn_between = 5,
  n_samples = 150
)


# Create samples for the imputation parameters by running the draws() function
set.seed(987)
drawObj <- draws(
  data = dat,
  data_ice = dat_ice,
  vars = vars,
  method = method,
  quiet = TRUE
)

Environment (please complete the following information):

  • OS: Linux
  • R version 4.3.1
  • rbmi version 1.2.6
@tobiasmuetze tobiasmuetze added the bug Something isn't working label Dec 5, 2024
@gowerc
Collaborator

gowerc commented Dec 9, 2024

Heya,

Thanks for reporting. That definitely looks like a bug so will take a look.

Having double-checked the documentation and my notes, I believe the intended default behaviour is that if a subject is missing from data_ice, all of their observations are regarded as having occurred before an ICE. All of their data is therefore included in the imputation model, and any missing observations are imputed under an implicit MAR assumption.

@nociale - Just wanted to double check what I've written still meets your expectations?

The error is a bit suspicious to me though, as we had a bug in that section of the code that we fixed when releasing v1.3.0 (#432).

@tobiasmuetze
Author

@gowerc Thanks! Please note that I haven't tested it yet in v1.3.0. We are still on v1.2.6 in our system.

@gowerc
Collaborator

gowerc commented Dec 9, 2024

OK, this appears to be an edge-case bug that was fixed by the changes we made in v1.3.0, though I'll add some additional unit tests to guard against it being accidentally re-introduced later.

So (based on the current code implementation rather than expectation), internally we create an object called longdata that tracks the state of each subject and visit individually. All subjects are initialised to have a strategy of MAR, and all observations are initialised to be regarded as pre-ICE. When data_ice is consumed, it is used to update away from this initial state; that is to say, if data_ice is blank, the initial state is left as-is.
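To make the "initialise, then update" idea concrete, here is a minimal base R sketch. It is not rbmi's actual longdata implementation: the functions `init_strategies()` and `apply_data_ice()` and the subject IDs "A", "B", "C" are hypothetical, made up purely for illustration.

```r
# Hypothetical sketch only; rbmi's real longdata object is more involved.

# Every subject starts with the MAR strategy.
init_strategies <- function(subject_ids) {
  setNames(rep("MAR", length(subject_ids)), subject_ids)
}

# Overwrite the strategy only for subjects listed in data_ice.
# With a zero-row data_ice nothing is overwritten.
apply_data_ice <- function(strategies, data_ice) {
  strategies[as.character(data_ice$PATIENT)] <- as.character(data_ice$strategy)
  strategies
}

strategies <- init_strategies(c("A", "B", "C"))

# Empty data_ice: the initial state is left as-is.
empty_ice <- data.frame(
  PATIENT = character(0),
  strategy = character(0),
  stringsAsFactors = FALSE
)
unchanged <- apply_data_ice(strategies, empty_ice)

# One subject in data_ice: only that subject moves away from MAR.
one_ice <- data.frame(PATIENT = "B", strategy = "JR", stringsAsFactors = FALSE)
updated <- apply_data_ice(strategies, one_ice)
```

Under these assumptions, `unchanged` is the same as `strategies`, and in `updated` only subject "B" has the "JR" strategy.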

The bug here was that we had a flag variable, has_nonMAR_to_MAR, which is used to indicate whether we need to show a warning to users. However, the variable wasn't initialised unless there was at least 1 subject in data_ice (which is why you are seeing the "not found" error).
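The general pattern behind this kind of error can be reproduced in a few lines of base R. The sketch below is an illustration only, not rbmi's source code: the function names `flag_buggy()` and `flag_fixed()` and the condition used to set the flag are placeholders.

```r
# Hypothetical illustration of the bug pattern; not rbmi's actual source.

flag_buggy <- function(data_ice) {
  for (i in seq_len(nrow(data_ice))) {
    # Placeholder condition. The flag is first created inside the loop,
    # so it does not exist at all when data_ice has zero rows.
    has_nonMAR_to_MAR <- identical(data_ice$strategy[i], "MAR")
  }
  has_nonMAR_to_MAR
}

flag_fixed <- function(data_ice) {
  # Initialise the flag before the loop so it always exists.
  has_nonMAR_to_MAR <- FALSE
  for (i in seq_len(nrow(data_ice))) {
    if (identical(data_ice$strategy[i], "MAR")) has_nonMAR_to_MAR <- TRUE
  }
  has_nonMAR_to_MAR
}

empty_ice <- data.frame(
  PATIENT = character(0),
  strategy = character(0),
  stringsAsFactors = FALSE
)

# The buggy version fails because the loop body never runs.
buggy_msg <- tryCatch(flag_buggy(empty_ice), error = function(e) conditionMessage(e))

# The fixed version returns the initial value of the flag.
fixed_flag <- flag_fixed(empty_ice)
```

Here `buggy_msg` holds an "object not found" message naming `has_nonMAR_to_MAR`, while `fixed_flag` is `FALSE`.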

But yes, this is all fixed in v1.3.0. A simple workaround if you aren't able to upgrade to v1.3.0 should be to include at least 1 subject in data_ice.
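For anyone stuck on v1.2.6, one option might be to check data_ice before passing it to draws(), so that an empty table fails with a clearer message. This is a sketch under that assumption: `assert_data_ice_has_rows()` is a hypothetical helper, not part of rbmi.

```r
# Hypothetical helper (not part of rbmi): fail early with a clearer message
# when data_ice is a data.frame or tibble with zero rows.
assert_data_ice_has_rows <- function(data_ice) {
  if (is.data.frame(data_ice) && nrow(data_ice) == 0) {
    stop(
      "data_ice has 0 rows. On rbmi v1.2.6 this leads to the ",
      "'has_nonMAR_to_MAR' not found error in draws(). ",
      "Upgrade to v1.3.0 or include at least 1 subject in data_ice."
    )
  }
  invisible(data_ice)
}

# Usage: an empty data_ice is rejected with the message above ...
empty_ice <- data.frame(PATIENT = character(0), stringsAsFactors = FALSE)
guard_msg <- tryCatch(
  assert_data_ice_has_rows(empty_ice),
  error = function(e) conditionMessage(e)
)

# ... while a non-empty data_ice is passed through unchanged.
# "A" is a made-up subject ID.
ok_ice <- assert_data_ice_has_rows(
  data.frame(PATIENT = "A", stringsAsFactors = FALSE)
)
```

The check would go just before the draws() call in the reproduction script, e.g. `assert_data_ice_has_rows(dat_ice)`.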

Apologies for the inconvenience.

@gowerc
Collaborator

gowerc commented Dec 9, 2024

@tobiasmuetze - I'll close this as we've already applied the fix, but please feel free to re-open if you have any additional follow-up questions.
