Major update that improves support for formulas specification #582

Open: wants to merge 39 commits into base: master

39 commits
4cf4953
Major update that improves support for formulas specification
stefvanbuuren Sep 11, 2023
ea84be3
Convert documentation Rd tags to markdown tags for roxygen2
stefvanbuuren Sep 11, 2023
5c6bee2
Add a data argument to nimp() to calculate number of imputations per …
stefvanbuuren Sep 12, 2023
755c23a
Restore classic predictorMatrix behaviour that sets predictorMatrix[j…
stefvanbuuren Sep 13, 2023
c2da03c
Clean up source, indicate that there is still a problem with edit.s…
stefvanbuuren Sep 13, 2023
28821a6
Create a make.nest(), n2b() and b2n() function for working with nest …
stefvanbuuren Sep 13, 2023
731bf25
Insist that predictorMatrix has a zero diagonal
stefvanbuuren Sep 13, 2023
8f92307
- Prevention of NA propagation
stefvanbuuren Sep 18, 2023
772c876
Add exit checks on mids object
stefvanbuuren Sep 18, 2023
465bd5c
Add test for zero predictorMatrix row if method == "", deal with rela…
stefvanbuuren Sep 18, 2023
c8ed335
Update news
stefvanbuuren Sep 18, 2023
05a0209
Update documentation for mice() arguments
stefvanbuuren Sep 18, 2023
6033fc6
Update list of builtin imputation methods
stefvanbuuren Sep 18, 2023
29fee22
Reorder sequence of mice() arguments
stefvanbuuren Sep 18, 2023
fef881b
Reorder nest in data sequence
stefvanbuuren Sep 19, 2023
ba383eb
Use lowercase 'b' and 'f' for automatic naming of blocks and formulas
stefvanbuuren Sep 19, 2023
4175534
Update error message in mpmm
stefvanbuuren Sep 19, 2023
0166992
Sort terms both for pred and formulas
stefvanbuuren Sep 19, 2023
35b6084
Create a mechanism to inform check.method() of the set of variables t…
stefvanbuuren Sep 21, 2023
65f544f
Introduce NA types in initialize.imp()
stefvanbuuren Sep 21, 2023
d9c6fa6
Update nest printing in print.mids()
stefvanbuuren Sep 21, 2023
b9e398e
Add support for blots to multivariate imputation models
stefvanbuuren Sep 21, 2023
0345ec3
Rename `nest` to `parcel`
stefvanbuuren Sep 21, 2023
07a79e9
Use lower case default block names
stefvanbuuren Sep 21, 2023
53916f4
Rename `blots` to `dots`
stefvanbuuren Sep 21, 2023
3c09055
Rename files from blots/nest to dots/parcel
stefvanbuuren Sep 21, 2023
3cebc30
Add deprecation support for make.blots()
stefvanbuuren Sep 21, 2023
7b7a17c
Implement autoremove in check.predictorMatrix() and check.formulas()
stefvanbuuren Sep 21, 2023
8c4bb38
Write one loggedEvent for each removed variable
stefvanbuuren Sep 22, 2023
24688b1
Abort mice when user specifies mixes of `formulas` and `predictorMatr…
stefvanbuuren Sep 22, 2023
e1c475f
Update NEWS.md
stefvanbuuren Sep 22, 2023
da6396b
Reorder mice() arguments into clusters of operations
stefvanbuuren Oct 2, 2023
db5caf6
Remove superfluous construct.parcel(), make remove.rhs.variables() in…
stefvanbuuren Oct 2, 2023
f5d5c99
Add MICE 4 Syntax Documentation CONCEPT as a vignette
stefvanbuuren Oct 2, 2023
6edcd71
Rebuild site to include article mice4syntax
stefvanbuuren Oct 2, 2023
232a0b6
Add test for character variable (#601)
stefvanbuuren Apr 17, 2024
09e58ea
Merge main and support_blocks into new branch mice4 (still failing so…
stefvanbuuren Apr 17, 2024
15321b4
Merging update
stefvanbuuren Apr 17, 2024
deac372
Update support_blocks with master
stefvanbuuren Nov 22, 2024
12 changes: 10 additions & 2 deletions R/formula.R
@@ -124,7 +124,8 @@ name.formulas <- function(formulas, prefix = "f") {
 }


-check.formulas <- function(formulas, data) {
+check.formulas <- function(formulas, data,
+                           autoremove = TRUE) {
   formulas <- name.formulas(formulas)
   formulas <- handle.oldstyle.formulas(formulas, data)
   formulas <- lapply(formulas, expand.dots, data)
@@ -143,7 +144,14 @@ check.formulas <- function(formulas, data) {
   completevars <- colnames(data)[!apply(is.na(data), 2, sum)]
   uip <- setdiff(notimputed, completevars)
   # if any of these are in RHS for formulas, remove them
-  formulas <- lapply(formulas, remove.rhs.variables, vars = uip)
+  removeme <- intersect(uip, as.vector(sapply(formulas, all.vars)))
+  if (length(removeme) && autoremove) {
+    formulas <- lapply(formulas, remove.rhs.variables, vars = removeme)
+    vars <- paste(removeme, collapse = ",")
+    updateLog(out = paste("incomplete predictor(s)", vars),
+              meth = "check", frame = 1)
+  }
+
   # add components y ~ 1 for y to formulas
   for (y in notimputed) {
     formulas[[y]] <- as.formula(paste(y, "~ 1"))
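The formula-side removal above can be tried outside the package with a small base R sketch. Everything here is an assumption for illustration: the toy formulas, the choice of `x` as the unimputed incomplete predictor, and the helper `drop_rhs()`, which is a hypothetical stand-in and not the package's `remove.rhs.variables()`. The sketch flattens variable names with `unlist(lapply(...))` because `sapply()` can return a list rather than a vector when the formulas contain different numbers of variables. It only handles plain variable names on the right-hand side.

```r
# Toy formulas with different numbers of variables.
formulas <- list(
  y = y ~ x + z,
  w = w ~ x
)

# Assumed for this sketch: x is incomplete and is not imputed.
uip <- "x"

# Collect every variable used in any formula as a flat character vector.
used <- unique(unlist(lapply(formulas, all.vars)))
removeme <- intersect(uip, used)

# Hypothetical helper: drop the given variables from the right-hand side.
drop_rhs <- function(f, vars) {
  for (v in vars) {
    f <- update(f, as.formula(paste(". ~ . -", v)))
  }
  f
}

cleaned <- lapply(formulas, drop_rhs, vars = removeme)
print(cleaned)
```

After the call, `x` no longer appears in either formula, and a formula whose only predictor was `x` is left with an intercept-only right-hand side.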
19 changes: 12 additions & 7 deletions R/mice.R
@@ -324,6 +324,8 @@
 #' variable group (or block) to which each variable is
 #' allocated.
 #' @param blots Deprecated. Replaced by `dots`.
+#' @param autoremove Logical. Should unimputed incomplete predictors be removed
+#' to prevent NA propagation?
 #'
 #' @return Returns an S3 object of class [`mids()`][mids-class]
 #' (multiply imputed data set)
@@ -413,6 +415,7 @@ mice <- function(data,
                  seed = NA,
                  data.init = NULL,
                  blots = NULL,
+                 autoremove = TRUE,
                  ...) {
   call <- match.call()

@@ -424,6 +427,10 @@ mice <- function(data,
     dots <- blots
   }

+  # data frame for storing the event log
+  state <- list(it = 0, im = 0, dep = "", meth = "", log = FALSE)
+  loggedEvents <- data.frame(it = 0, im = 0, dep = "", meth = "", out = "")
+
   if (!is.na(seed)) set.seed(seed)

   # check form of data and m
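This diff moves the creation of `state` and `loggedEvents` ahead of the `check.*()` calls, which now write to the log. The sketch below illustrates, under stated assumptions, why the order matters when a logger writes into a calling frame. `log_event()`, `check_step()`, `driver()` and `driver_without_log()` are hypothetical names for illustration only; this is not how `updateLog()` is implemented in the package.

```r
# Hypothetical logger: appends one row to `loggedEvents` in the frame of the
# function that called the caller of log_event().
log_event <- function(out, meth) {
  envir <- parent.frame(2)
  if (!exists("loggedEvents", envir = envir, inherits = FALSE)) {
    stop("loggedEvents has not been initialised yet")
  }
  events <- get("loggedEvents", envir = envir, inherits = FALSE)
  row <- data.frame(it = 0, im = 0, dep = "", meth = meth, out = out)
  assign("loggedEvents", rbind(events, row), envir = envir)
  invisible(NULL)
}

# Stand-in for a check function that reports a removed predictor.
check_step <- function() {
  log_event(out = "incomplete predictor(s) x", meth = "check")
}

# The log exists before the check runs, so the event is recorded.
driver <- function() {
  loggedEvents <- data.frame(it = 0, im = 0, dep = "", meth = "", out = "")
  check_step()
  loggedEvents
}

# No log in this frame, so the logger has nowhere to write.
driver_without_log <- function() {
  check_step()
}

events <- driver()
failed <- inherits(try(driver_without_log(), silent = TRUE), "try-error")
```

In this sketch `events` holds the initial row plus the logged one, and `failed` is `TRUE` because the second driver never created the log.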
@@ -451,7 +458,8 @@ mice <- function(data,
   # case B
   if (!mp & mb & mf) {
     # predictorMatrix leads
-    predictorMatrix <- check.predictorMatrix(predictorMatrix, data)
+    predictorMatrix <- check.predictorMatrix(predictorMatrix, data,
+                                             autoremove = autoremove)
     blocks <- make.blocks(colnames(predictorMatrix), partition = "scatter")
     formulas <- make.formulas(data, blocks, predictorMatrix = predictorMatrix)
   }
@@ -467,15 +475,16 @@ mice <- function(data,
   # case D
   if (mp & mb & !mf) {
     # formulas leads
-    formulas <- check.formulas(formulas, data)
+    formulas <- check.formulas(formulas, data, autoremove = autoremove)
     blocks <- construct.blocks(formulas)
     predictorMatrix <- f2p(formulas, data, blocks)
   }

   # case E
   if (!mp & !mb & mf) {
     # predictor leads (use for multivariate imputation)
-    predictorMatrix <- check.predictorMatrix(predictorMatrix, data)
+    predictorMatrix <- check.predictorMatrix(predictorMatrix, data,
+                                             autoremove = autoremove)
     blocks <- check.blocks(blocks, data, calltype = "pred")
     formulas <- make.formulas(data, blocks, predictorMatrix = predictorMatrix)
   }
@@ -571,10 +580,6 @@ mice <- function(data,
   dots <- check.dots(dots, data, blocks)
   ignore <- check.ignore(ignore, data)

-  # data frame for storing the event log
-  state <- list(it = 0, im = 0, dep = "", meth = "", log = FALSE)
-  loggedEvents <- data.frame(it = 0, im = 0, dep = "", meth = "", out = "")
-
   # edit imputation setup
   setup <- list(
     method = method,
19 changes: 17 additions & 2 deletions R/predictorMatrix.R
@@ -45,7 +45,8 @@ make.predictorMatrix <- function(data, blocks = make.blocks(data),

 check.predictorMatrix <- function(predictorMatrix,
                                   data,
-                                  blocks = NULL) {
+                                  blocks = NULL,
+                                  autoremove = TRUE) {
   data <- check.dataform(data)

   if (!is.matrix(predictorMatrix)) {
@@ -82,9 +83,23 @@ check.predictorMatrix <- function(predictorMatrix,
     )
   }

-  # calculate ynames (variables to impute) for use in check.method()
+  # NA-propagation prevention
+  # find all dependent (imputed) variables
   hit <- apply(predictorMatrix, 1, function(x) any(x != 0))

Review comment (Contributor): Can be simplified to: apply(predictorMatrix != 0, 1, any)

   ynames <- row.names(predictorMatrix)[hit]
+  # find all variables in data that are not imputed
+  notimputed <- setdiff(colnames(data), ynames)
+  # select uip: unimputed incomplete predictors
+  completevars <- colnames(data)[!apply(is.na(data), 2, sum)]

Review comment (Contributor): !apply(is.na(data), 2, any) might be more efficient

+  uip <- setdiff(notimputed, completevars)
+  # if any of these are predictors, remove them
+  removeme <- intersect(uip, colnames(predictorMatrix))
+  if (length(removeme) && autoremove) {
+    predictorMatrix[, removeme] <- 0
+    vars <- paste(removeme, collapse = ",")
+    updateLog(out = paste("incomplete predictor(s)", vars),
+              meth = "check", frame = 1)
+  }

   # grow predictorMatrix to all variables in data
   if (ncol(predictorMatrix) < ncol(data)) {
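For readers following the NA-propagation logic in `check.predictorMatrix()`, here is a minimal standalone sketch in base R. The toy data frame and the names `pred` and `pred_clean` are made up for illustration; the sketch does not call the package and leaves out the logging step. It uses the `any`-based forms suggested in the review comments, which should give the same result as the originals on this input.

```r
# Toy data: x is incomplete but will not be imputed, y is incomplete and
# imputed, z is complete.
data <- data.frame(
  x = c(1, NA, 3, 4),
  y = c(NA, 2, 3, 4),
  z = c(1, 2, 3, 4)
)

# Rows are targets, columns are predictors. Only y has a non-zero row.
pred <- matrix(0, nrow = 3, ncol = 3,
               dimnames = list(names(data), names(data)))
pred["y", c("x", "z")] <- 1

# Variables with at least one non-zero entry in their row are imputed.
hit <- apply(pred != 0, 1, any)
ynames <- rownames(pred)[hit]

# Complete variables have no missing values.
completevars <- colnames(data)[!apply(is.na(data), 2, any)]

# uip: unimputed incomplete predictors.
notimputed <- setdiff(colnames(data), ynames)
uip <- setdiff(notimputed, completevars)

# Zero the offending columns so missing values in x cannot leak into the
# imputation model for y.
removeme <- intersect(uip, colnames(pred))
pred_clean <- pred
pred_clean[, removeme] <- 0

print(pred_clean)
```

Here `x` is the only unimputed incomplete predictor, so its column is zeroed while `z` remains a predictor of `y`.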
4 changes: 4 additions & 0 deletions man/mice.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.