diff --git a/simplify.html b/simplify.html
index 8c638a6..1332d72 100644
--- a/simplify.html
+++ b/simplify.html
@@ -2,12 +2,12 @@
choosing random effects terms in mixed models
@@ -2655,7 +3262,7 @@

choosing random effects terms in mixed models

Published
-

July 23, 2024

+

November 7, 2024

@@ -2663,16 +3270,136 @@

choosing random effects terms in mixed models

+ +

Source file is here. Also see Emi Tanaka’s approach to the example below, mostly with AS-REML.

An R-centric discussion of what to do to manage the complexity of mixed models, specifically their random effect components.

+
+

General framework

+
+

maximal model

+

The maximal model (Barr et al. 2013) is the model that includes all random-effect terms that would be identifiable, given the experimental (or observational) design and arbitrarily large samples (both of groups and of observations within groups); this is what I would call satisfying “strong identifiability”. (A schematic example of a maximal model formula is sketched after the list below.)

+
    +
  • all categorical variables in the data with exchangeable levels are treated as random effects (Yarkoni 2022)
  • +
  • all terms in the fixed-effect model are included as varying terms where possible, i.e. within every grouping variable where the values of the fixed-effect term vary within (and not just between) groups
  • +
  • for models where the conditional distribution has an estimated scale parameter (e.g. Gaussian, Gamma) and a varying term with \(n\) parameters (e.g. \(n=2\) for a random-slopes model), there must be at least \(n+1\) observations per group in order for the scale parameter (the residual variance, in the Gaussian case) to be identifiable; e.g. see the “starlings” example here. For models with fixed scale/dispersion (e.g. Poisson, binomial) the limit is \(n\) observations per group (unless an observation-level random effect is added, in which case we again need \(n+1\)). For models with estimated dispersion rather than scale (e.g. negative binomial), the model is theoretically identifiable with only \(n\) observations per group, but in practice it probably won’t work.
  • +
+
+
+
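As a hypothetical illustration (not part of the example analyzed below; subject, item, and x are made-up names), a maximal model for a design in which a predictor x varies within two crossed grouping variables might look like this:

## hypothetical sketch: maximal model for a design with two crossed grouping variables
## (subject, item) and a predictor x that varies within both; the data are pure noise,
## so the fit will very likely be singular (which is part of the point)
library(lme4)
set.seed(101)
dat <- expand.grid(subject = factor(1:10), item = factor(1:10), rep = 1:3)
dat$x <- rnorm(nrow(dat))
dat$y <- rnorm(nrow(dat))
m_max <- lmer(y ~ x + (1 + x | subject) + (1 + x | item), data = dat)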

model constraint strategies

+

When we constrain the covariance matrix (i.e. any model more specific than “general positive semi-definite”), the model results and definition depend on the contrasts used. For factor terms, it might make the most sense to specify sum-to-zero contrasts unless there is a clear reference/control level against which all other levels should be compared; for continuous terms, the predictors should probably be zero-centered (or at least zeroed at some sensible reference value; see the end of section 2.2 in Bates, Mächler, et al. (2015)). A short sketch of these adjustments follows.
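A minimal sketch of these preliminary adjustments (f, x, g, and y are illustrative names, not taken from the example below):

## sketch: sum-to-zero contrasts for a factor, centering for a continuous predictor
set.seed(101)
dat <- data.frame(f = factor(rep(letters[1:4], each = 25)),
                  x = runif(100, 10, 20),
                  g = factor(rep(1:10, 10)),
                  y = rnorm(100))
contrasts(dat$f) <- contr.sum(nlevels(dat$f))  ## sum-to-zero contrasts
dat$xc <- dat$x - mean(dat$x)                  ## center at the mean (or a reference value)
## the meaning of a constrained (diagonal/CS) covariance now depends on these choices, e.g.
## lme4::lmer(y ~ f + xc + (1 + xc || g), data = dat)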

+
+

compound-symmetric models

+

For factor models with many levels, compound-symmetric models are a good simplification approach. This restricts the correlations among levels to be identical for all pairs of levels (i.e., a correlation matrix with all off-diagonal elements equal to the same constant \(\rho\)).

+

For example, working with the built-in cbpp data set where period is a four-level factor, the covariance matrices implied by (period|herd) vs. (1|herd/period) are:

+

\[
(b_{\{h|1\}}, b_{\{h|p_{21}\}}, b_{\{h|p_{31}\}}, b_{\{h|p_{41}\}}) \sim
\textrm{MVN}\left(\boldsymbol 0,
\left[
\begin{array}{cccc}
\sigma^2_{\{h|1\}} & . & . & . \\
\sigma_{\{h|1\},\{h|p_{21}\}} & \sigma^2_{\{h|p_{21}\}} & . & . \\
\sigma_{\{h|1\},\{h|p_{31}\}} & \sigma_{\{h|p_{21}\},\{h|p_{31}\}} & \sigma^2_{\{h|p_{31}\}} & . \\
\sigma_{\{h|1\},\{h|p_{41}\}} & \sigma_{\{h|p_{21}\},\{h|p_{41}\}} & \sigma_{\{h|p_{31}\},\{h|p_{41}\}} & \sigma^2_{\{h|p_{41}\}}
\end{array}
\right]
\right)
\] (= \(n(n+1)/2 = (4\times 5)/2 = 10\) parameters) vs. \[
\left[
\begin{array}{cccc}
\sigma^2 & . & . & . \\
\rho \sigma^2 & \sigma^2 & . & . \\
\rho \sigma^2 & \rho \sigma^2 & \sigma^2 & . \\
\rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \sigma^2 \\
\end{array}
\right]
\] where \(\sigma^2 = \sigma^2_{\{h|1\}}+\sigma^2_{\{herd:period|1\}}\), \(\rho = \sigma^2_{\{h|1\}}/\sigma^2\) (= 2 parameters; \(\rho\) must be > 0)

+
    +
  • The shortcut of using a nested model works in most mixed-model frameworks (almost any platform that allows for multiple random effects will allow them to be nested), but is restricted to positive compound-symmetric models (i.e. \(\rho>0\)). Some packages allow general compound-symmetric matrices (i.e. \(-1 < \rho < 1\)), e.g. pdCompSymm in nlme, cs() in glmmTMB, cosy in brms (apparently MCMCglmm doesn’t have a CS structure?) … (a sketch comparing these parameterizations for the cbpp example follows this list)
  • +
  • We also need to distinguish between heterogeneous compound symmetry (i.e., each level has its own variance; \(n+1\) parameters for an \(n \times n\) covariance matrix) and homogeneous compound symmetry (all variances are the same; 2 parameters regardless of the size of the covariance matrix). The nesting trick described above only handles homogeneous CS. cs() in glmmTMB fits heterogeneous CS; homogeneous CS could be handled by using the map argument to fix all of the standard deviations to be equal (or a homcs() structure could be added if there’s enough demand for it).
  • +
+
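A minimal sketch of these parameterizations for the cbpp example (the binomial model specification here is my illustration, not taken from the text above):

## sketch: unstructured vs. compound-symmetric covariances for the cbpp data
library(lme4)      ## for the cbpp data
library(glmmTMB)
data("cbpp", package = "lme4")
## unstructured 4x4 covariance among period effects within herd (10 parameters)
m_us   <- glmmTMB(cbind(incidence, size - incidence) ~ period + (period | herd),
                  family = binomial, data = cbpp)
## homogeneous compound symmetry via nesting (2 parameters; rho > 0 only)
m_nest <- glmmTMB(cbind(incidence, size - incidence) ~ period + (1 | herd/period),
                  family = binomial, data = cbpp)
## heterogeneous compound symmetry, allowing -1 < rho < 1 (5 parameters)
m_cs   <- glmmTMB(cbind(incidence, size - incidence) ~ period + cs(period | herd),
                  family = binomial, data = cbpp)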
+
+

independence models

+

This is probably the best known approach to model simplification (e.g. as recommended by Barr et al. (2013)). It’s actually a special case of CS, with \(\rho=0\). Also called “diagonal” for obvious (?) reasons.

+
    +
  • in lme4 (or glmmTMB or brms), use || in place of |; glmmTMB allows diag() for heterogeneous diagonal/independence models (and homdiag() for homogeneous); MCMCglmm uses idv() (homogeneous) and idh() (heterogeneous).
  • +
  • || (still) doesn’t work in lme4 for factor-valued terms; use afex::mixed or glmmTMB, or expand the model in dummies ((0 + dummy(f, level1)|g) + (0 + dummy(f, level2)|g) + ...); this is kind of a pain for factors with many levels, although you could construct the formula via string processing if you wanted (a sketch of that approach follows this list)
  • +
  • see general comments above about the dependence of the constrained model on the zero point of the numeric terms
  • +
+
+
+
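A rough sketch of the string-processing version of the dummy expansion (f and g are the same illustrative names used above; lme4’s dummy() helper is assumed to behave as documented):

## sketch: expand a factor-valued || term into per-level dummy terms by string processing
library(lme4)
set.seed(101)
dat <- data.frame(f = factor(rep(letters[1:3], each = 40)),
                  g = factor(rep(1:10, 12)),
                  y = rnorm(120))
## build one (0 + dummy(f, <level>) | g) term per level of f
re_terms <- paste0("(0 + dummy(f, '", levels(dat$f), "') | g)", collapse = " + ")
form <- reformulate(c("f", re_terms), response = "y")
## expands to y ~ f + (0 + dummy(f, 'a') | g) + (0 + dummy(f, 'b') | g) + (0 + dummy(f, 'c') | g)
m_ind <- lmer(form, data = dat)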

dropping terms

+

Especially for vector-valued random effects that describe variation in multiple numeric predictors, it may make sense to drop the components with very small variance, or those that are of less interest. For example, you might reduce the random-effect term (1 + fire + NPP | biome) to (1 + fire | biome) by dropping the variation in NPP across biomes (a minimal sketch follows).

+
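A minimal sketch with simulated data (fire, NPP, and biome are the illustrative names from the text above):

## sketch: dropping a low-variance component from a vector-valued random-effect term
library(lme4)
set.seed(101)
dat <- data.frame(biome = factor(rep(1:12, each = 20)),
                  fire  = rnorm(240),
                  NPP   = rnorm(240),
                  y     = rnorm(240))
## full model: intercept, fire, and NPP all vary across biomes (3x3 covariance matrix)
m_full <- lmer(y ~ fire + NPP + (1 + fire + NPP | biome), data = dat)
## reduced model: drop the (presumably small) variation in NPP across biomes,
## but keep NPP as a fixed effect
m_red  <- lmer(y ~ fire + NPP + (1 + fire | biome), data = dat)
anova(m_red, m_full)  ## refits with ML and compares the two models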
+
+

reduced-rank models

+

Factor-analytic or reduced-rank models are a more flexible approach to simplifying complex covariance functions, by doing something analogous to computing a principal components analysis on the covariance matrix and keeping a subset of the terms. In glmmTMB you can do this with the rr() covariance structure; see the relevant section of the covariance structure vignette, or this draft by McGillycuddy et al., in press in J. Stat. Software.

+
+
+
+

parameter constraint strategies

+

Instead of changing the covariance model to reduce its dimensionality, you can add a Bayesian prior (or a regularization term, if you’re not Bayesian) to make the model better behaved and/or keep it from being singular.

+
    +
  • If you want to apply a minimally informative prior that will prevent singularity, Chung et al. (2013) show that an improper Gamma(shape=2, rate=0) is the weakest Gamma prior for the standard deviation that will prevent a posterior mode at zero. (The mode will be “approximately one standard error away from zero when the maximum likelihood estimate is at zero”.) In practice this might be implemented using a very small rate parameter, or equivalently a very large scale parameter. A Wishart prior with \(\nu = 1 + d\) and infinite scale might (?) have similar properties for vector-valued random effects/covariance matrices larger than 1×1. The blme package implements this approach (although with a default shape of 2.5 rather than 2); it can also be implemented in glmmTMB (see the priors vignette), and of course in any of the full-Bayesian packages (brms, rstanarm, MCMCglmm, etc.). (A blme sketch follows this list.)
  • +
  • If you’re a real Bayesian or are otherwise comfortable with stronger priors, you can choose from a variety of other priors (e.g. half-Normal or half-\(t\) or Gamma or log-Normal priors for standard deviations, Wishart or inverse-Wishart distributions for full covariance matrices, LKJ priors for correlations) to help out your model. (LKJ priors are a nice generalization of the diagonal covariance structures above, as an LKJ prior with \(\eta>1\) pushes the correlation matrix to be closer to diagonal without constraining it to be exactly diagonal …)
  • +
  • a real Bayesian would also say you should not choose priors just to make your model behave better (e.g. eliminate divergences in a Hamiltonian Monte Carlo run). Instead, you should use prior predictive simulations to tune the priors to limit the random-effects structures to those that produce reasonable output (and then cross your fingers that this also resolves your computational problems).
  • +
+
+
+
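A minimal sketch of the blme version (the data are simulated just for illustration, and the cov.prior specification is my assumption about blme’s syntax; see ?blmer for the authoritative form):

## sketch: boundary-avoiding covariance prior via blme
library(blme)
set.seed(101)
dat <- data.frame(g = factor(rep(1:8, each = 5)), y = rnorm(40))
## blmer's default covariance prior is a Chung et al.-style boundary-avoiding prior
## (default shape 2.5); here the gamma prior family is requested explicitly
m_blme <- blmer(y ~ 1 + (1 | g), data = dat, cov.prior = gamma)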

multiple RE terms

+

Given all these options, any combination of which might apply to different RE terms in a model with more than one random effect, how should one choose among them?

+
    +
  • if a model seemed to be sampling or converging to an optimum reasonably well but the fit was singular (i.e. on the boundary of the space of allowed covariance estimates), and the results were sufficiently robust that you could get estimates and standard errors/CIs for all of the values of interest, you might choose to proceed with the singular model. Singular fits are theoretically valid; they just make a lot of downstream procedures, such as those based on calculating curvature of the log-likelihood surface, harder. It might also be reasonable to suppose that numerical procedures could be less reliable in this case.
  • +
  • Barr et al. (2013) suggest “keep[ing] it maximal”, i.e. reducing the model only where necessary to deal with convergence issues (they don’t actually mention singular fits, nor say much about the issues with convergence checks in lme4):

> In cases of nonconvergence, simplification of the random effects structure proceeded as follows. For between-items designs, the by-subjects random slope was dropped. For within-items designs, statistics from the partially converged model were inspected, and the slope associated with smaller variance was dropped.

It doesn’t seem that they considered models with multiple random-effects terms (where one wouldn’t necessarily know which term to try simplifying first …)
  • +
  • Matuschek et al. (2017) prefer a model selection approach; they lay out a stepwise approach based on likelihood ratio tests with a relaxed threshold (\(\alpha = 0.2\)) for retaining terms. Based on a model with two random-slopes terms, they suggest the following sequence:1
  • +
+
    +
  1. full model
  2. independent slopes and intercepts for both terms [\(\rho = 0\)]
  3. drop the random slope for one term [item-specific random slopes]
  4. drop the random slope for the other term [subject-specific random slopes]
  5. drop both random slopes

They also consider AIC-based selection.
+
    +
  • Moritz, Batllori, and Bolker (2023) used a more complex model than either of the references above: the among-group variation of an intercept and four numeric predictors (i.e., a 5×5 covariance matrix), applied at three different grouping levels. They considered the possibilities of (1) a full (unconstrained) covariance matrix, (2) a diagonal/independent covariance, or (3) an intercept-only random effect, for each grouping variable, i.e. a total of \(3^3 = 27\) possible models. They fitted all of these models and chose the model with the best AIC among only the models with non-singular fits (a schematic version of this strategy is sketched at the end of this section).
  • +
  • Singmann and Kellen (2019) don’t present any methods beyond those suggested above, but their section on “Specifying the Random Effects Structure” has a clear description.
  • +
  • several R packages have implemented model selection machinery for the random effects (most of the options, e.g. MuMIn, glmulti, glmmLasso, do model selection only on the fixed effects component) +
      +
    • The buildmer package is probably the fanciest: it
    • +
    +
    +

    [f]inds the largest possible regression model that will still converge for various types of regression analyses … and then optionally performs stepwise elimination similar to the forward and backward effect-selection methods in SAS, based on the change in log-likelihood or its significance, Akaike’s Information Criterion, the Bayesian Information Criterion, the explained deviance, or the F-test of the change in R². Its allowed steps appear to include both including/excluding particular terms from the RE term for a particular grouping variable, as well as including or excluding complete terms (i.e. adding or dropping a grouping variable)

    +
    +
      +
    • step.lmerModLmerTest() from lmerTest does backward stepwise selection (among random-effects terms, i.e. not considering simplification of individual terms as discussed above) based on likelihood ratio tests
    • +
    • ffRanefLMER.fnc() from LMERConvenienceFunctions does forward selection (the description of the ran.effects argument that specifies the possible random effects is complicated …)
    • +
    • the asremlPlus package has a lot of machinery, and some detailed examples, for model building and selection using AS-REML
    • +
  • +
+
+
+
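A schematic sketch of the “fit a set of candidate structures, discard singular fits, compare the rest by AIC” strategy (the data and names here are simulated for illustration; ML fits are used so that the AICs are comparable):

## sketch: compare candidate random-effect structures, keeping only non-singular fits
library(lme4)
set.seed(101)
dat <- expand.grid(rep = 1:5, f = factor(letters[1:4]), g = factor(1:6))
## simulate some group-level and group:factor-level variation so that at least
## some of the candidate structures are non-singular
dat$y <- rnorm(nrow(dat)) +
         rnorm(6)[as.integer(dat$g)] +
         0.5 * rnorm(24)[as.integer(interaction(dat$f, dat$g))]
cands <- list(
  full  = y ~ f + (f | g),     ## unstructured covariance
  homcs = y ~ f + (1 | g/f),   ## homogeneous compound symmetry via nesting
  int   = y ~ f + (1 | g)      ## intercept-only
)
fits <- lapply(cands, lmer, data = dat, REML = FALSE)
ok   <- !sapply(fits, isSingular)
sort(sapply(fits[ok], AIC))    ## best (smallest) AIC among the non-singular fits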
+

Example

setup

library(lme4)
 library(Matrix)
-library(glmmTMB)
+library(glmmTMB)
+## For rlkj_corr_cholesky; LKJ simulation also available in `rethinking` package
+## (GitHub only: remotes::install_github("rmcelreath/rethinking"))
+library(nimble)
+library(blme)
+conflicted::conflict_prefer("simulate", "stats")

For this example we’ll think about a forest restoration experiment with four plots; 5 restoration treatments that are applied in each plot (a randomized complete block design); and 6 replicates within each block/treatment.

@@ -2695,8 +3422,8 @@

setup

sigma = 1)))[[1]]
-
-

maximal model

+
+

maximal model

We would like to be able to fit ~ ttt + (ttt|plot), but that’s very unlikely to actually work:

m1 <- lmer(y ~ ttt + (ttt|plot), data  = dd)
@@ -2740,49 +3467,16 @@

diagnostics

image(Matrix(r$plot$rotation))
-

+
+
+

+
+
+
+

constraining the model

m1_homcs_lmer <- lmer(y ~ ttt + (1|plot/ttt), data  = dd)
 m1_homcs_glmmTMB <- glmmTMB(y ~ ttt + (1|plot/ttt), data  = dd,
@@ -2790,28 +3484,16 @@ 

compound-symmetric models
m1_hetcs_glmmTMB <- glmmTMB(y ~ ttt + cs(ttt|plot), data = dd, REML = TRUE)

+
+

independence models

m1_diag_glmmTMB <- glmmTMB(y ~ ttt + diag(ttt|plot), data  = dd,
                            REML = TRUE)
 ## also try afex::mixed?
+
+

reduced-rank models

m1_rr1_glmmTMB <- glmmTMB(y ~ ttt + rr(ttt|plot, d=1), data  = dd,
                            REML = TRUE)
@@ -2825,14 +3507,8 @@ 

reduced-rank models

+
+

parameter constraints/regularization

Constraining just the standard deviations to be positive doesn’t help us: the 5×5 correlation matrix is only rank-4 … a singularity-avoiding prior for the correlation matrix isn’t available yet.

gprior <- data.frame(prior = "gamma(1e8, 2.5)",
@@ -2855,7 +3531,7 @@ 

parameter
cormat <- cov2cor(VarCorr(m1_us_prior_glmmTMB)$cond[[1]])
 print(eigen(cormat)$value, digits = 3)
-
[1] 3.85e+00 8.44e-01 2.15e-01 8.85e-02 6.46e-11
+
[1] 3.81e+00 8.32e-01 2.62e-01 9.17e-02 4.93e-11

@@ -2879,38 +3555,6 @@

AIC comparison


to do

    @@ -2921,31 +3565,35 @@

    to do

    +
-

references

-
+

references

+
Barr, Dale J., Roger Levy, Christoph Scheepers, and Harry J. Tily. 2013. “Random Effects Structure for Confirmatory Hypothesis Testing: Keep It Maximal.” Journal of Memory and Language 68 (3): 255–78. https://doi.org/10.1016/j.jml.2012.11.001.
-
+
Bates, Douglas, Reinhold Kliegl, Shravan Vasishth, and Harald Baayen. 2015. “Parsimonious Mixed Models.” arXiv:1506.04967 [Stat], June. http://arxiv.org/abs/1506.04967.
-
+
Bates, Douglas, Martin Mächler, Benjamin M. Bolker, and Steven C. Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.
-
+
Chung, Yeojin, Sophia Rabe-Hesketh, Vincent Dorie, Andrew Gelman, and Jingchen Liu. 2013. “A Nondegenerate Penalized Likelihood Estimator for Variance Parameters in Multilevel Models.” Psychometrika, 1–25. https://doi.org/10.1007/s11336-013-9328-2.
-
+
Matuschek, Hannes, Reinhold Kliegl, Shravan Vasishth, Harald Baayen, and Douglas Bates. 2017. “Balancing Type I Error and Power in Linear Mixed Models.” Journal of Memory and Language 94: 305–15. https://doi.org/10.1016/j.jml.2017.01.001.
-
+
Moritz, Max A., Enric Batllori, and Benjamin M. Bolker. 2023. “The Role of Fire in Terrestrial Vertebrate Richness Patterns.” Ecology Letters 26 (4): 563–74. https://doi.org/10.1111/ele.14177.
-
+
Singmann, Henrik, and David Kellen. 2019. “An Introduction to Mixed Models for Experimental Psychology.” In New Methods in Cognitive Psychology, edited by Daniel Spieler and Eric Schumacher, 1st ed., 4–31. Routledge. https://doi.org/10.4324/9780429318405-2.
+
Yarkoni, Tal. 2022. “The Generalizability Crisis.” Behavioral and Brain Sciences 45 (January): e1. https://doi.org/10.1017/S0140525X20001685.

Footnotes
