forked from MathiasHarrer/Doing-Meta-Analysis-in-R
# Subgroup Analyses {#subgroup}
![](_figs/subgroup.jpg)
In [Chapter 6](#heterogeneity), we discussed in depth why **between-study heterogeneity** is such an important issue when we are interpreting the results of our meta-analysis, and how we can **explore sources of heterogeneity** using [outlier](#outliers) and [influence analyses](#influenceanalyses).
Another source of between-study heterogeneity making our effect size estimate less precise could be that **there are slight differences in the study design or intervention components between the studies**. For example, in a meta-analysis on the effects of **cognitive behavioral therapy** (CBT) for **depression** in **university students**, it could be the case that some studies delivered the intervention in a **group setting**, while others delivered the therapy to each student **individually**. In the same example, it is also possible that studies used different **criteria** to determine if a student suffers from **depression** (e.g. they either used diagnostic interviews or self-report questionnaires).
Many other differences of this sort are possible, and it seems plausible that such study differences may also be associated with differences in the overall effect.
In **subgroup analyses**, we therefore have a look at different **subgroups within the studies of our meta-analysis** and try to determine if they **differ in terms of their effects**.
```{block,type='rmdinfo'}
**The idea behind subgroup analyses**
Boiled down, every subgroup analysis consists of **two parts**: (1) **pooling the effect of each subgroup**, and (2) **comparing the effects of the subgroups** [@borenstein2013meta].
**1. Pooling the effect of each subgroup**
This point is rather straightforward, as the same criteria as the ones for a **simple meta-analysis without subgroups** (see [Chapter 4](#pool) and [Chapter 4.2](#random)) apply here.
* If you assume that **all studies in a subgroup** stem from the same population and share **one true effect**, you may use the **fixed-effect-model**. As we mention in [Chapter 4](#pool), many **doubt** that this assumption is ever **true in psychological** and **medical research**, even when we partition our studies into subgroups.
* The alternative, therefore, is to use a **random-effect-model** which assumes that the studies within a subgroup are drawn from a **universe** of populations, for which we want to estimate the **mean**.
**2. Comparing the effects of the subgroups**
After we have calculated the pooled effect for each subgroup, **we can compare the size of the effects of each subgroup**. However, to know if this difference is in fact significant and/or meaningful, we have to calculate the **Standard Error of the differences between subgroup effect sizes** $SE_{diff}$, which we need to calculate **confidence intervals** and conduct **significance tests**. There are **two ways to calculate** $SE_{diff}$, and they are based on different assumptions.
* **Fixed-effects (plural) model**: The fixed-effects-model for subgroup comparisons is appropriate when **we are only interested in the subgroups at hand** [@borenstein2013meta]. This is the case when **the subgroups we chose to examine** were not randomly "chosen", but represent fixed levels of a characteristic we want to examine. Sex is such a characteristic, as its two subgroups **female** and **male** were not randomly chosen, but are the two subgroups that sex (in its "classical" conception) has. The same applies, for example, if we were to examine whether studies in patients with **clinical depression** versus **subclinical depression** yield different effects. Borenstein and Higgins [@borenstein2013meta] argue that the **fixed-effects (plural) model** may be the **only plausible model** for most analyses in **medical research, prevention, and other fields**.
As this model assumes that **no further sampling error is introduced at the subgroup level** (because subgroups were not randomly sampled, but are fixed), $SE_{diff}$ only depends on the *variances of the pooled effects* of subgroups $A$ and $B$, $V_A$ and $V_B$. The variance of the difference, $V_{Diff}$, is their sum, and $SE_{diff}=\sqrt{V_{Diff}}$.
$$V_{Diff}=V_A + V_B$$
The fixed-effects (plural) model can be used to test differences in the pooled effects between subgroups, while the pooling **within the subgroups is still conducted using a random-effects model**. Such a combination is sometimes called a **mixed-effects model**. We will show you how to use this model in R in the [next chapter](#mixed).
* **Random-effects model**: The random-effects-model for between-subgroup-effects is appropriate when the **subgroups we use were randomly sampled from a population of subgroups**. For example, we could be interested in whether the effect of an intervention **varies by region**, looking at studies from 5 different countries (e.g., Netherlands, USA, Australia, China, Argentina). This variable "region" has many different potential subgroups (countries), from which we randomly selected five. This means that we introduced a **new sampling error**, which we have to control for using the **random-effects model** for between-subgroup comparisons.
The (simplified) formula for the estimation of $V_{Diff}$ using this model therefore looks like this:
$$V_{Diff}=V_A + V_B + \frac{\hat T^2_G}{m} $$
Where $\hat T^2_G$ is the **estimated variance between the subgroups**, and $m$ is the **number of subgroups**.
```
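The two formulas for $V_{Diff}$ in the box above can be made concrete with a small base *R* sketch. All numbers here (the pooled effects `g_A` and `g_B`, their variances `V_A` and `V_B`, the between-subgroup variance `T2_G`, and `m`) are made-up illustration values, not results from the data used in this guide. The helper `compare_subgroups` is a hypothetical name, and the $z$-test and 95% confidence interval assume the usual normal approximation.

```r
# Hypothetical pooled effects (Hedges' g) and their variances for two subgroups.
g_A <- 0.60; V_A <- 0.010
g_B <- 0.35; V_B <- 0.015

# Helper: z-test and 95% confidence interval for a difference with variance V_diff.
compare_subgroups <- function(diff, V_diff) {
  se <- sqrt(V_diff)
  z  <- diff / se
  c(diff  = diff,
    se    = se,
    z     = z,
    p     = 2 * pnorm(-abs(z)),
    lower = diff - qnorm(0.975) * se,
    upper = diff + qnorm(0.975) * se)
}

diff_AB <- g_A - g_B

# Fixed-effects (plural) model: V_Diff = V_A + V_B
res_fixed <- compare_subgroups(diff_AB, V_A + V_B)

# Random-effects model (simplified formula from the box above):
# V_Diff = V_A + V_B + T2_G / m
T2_G <- 0.02   # hypothetical estimated variance between subgroups
m    <- 5      # hypothetical number of subgroups
res_random <- compare_subgroups(diff_AB, V_A + V_B + T2_G / m)

round(rbind(fixed = res_fixed, random = res_random), 3)
```

Because the random-effects model adds a non-negative term to $V_{Diff}$, its standard error is never smaller than under the fixed-effects (plural) model, so the same difference yields a wider confidence interval and a larger $p$-value.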
```{block,type='rmdachtung'}
Be aware that subgroup analyses should **always be based on an informed, *a priori* decision** about which subgroup differences within the study might be **practically relevant**, and would lead to information gain on relevant **research questions** in your field of research. It is also **good practice** to specify your subgroup analyses **before you do the analysis**, and list them in **the registration of your analysis**.
It is also important to keep in mind that **the capability of subgroup analyses to detect meaningful differences between studies is often limited**. Subgroup analyses also need **sufficient power**, so it makes no sense to compare two or more subgroups when the total number of studies in your meta-analysis is smaller than $k=10$ [@higgins2004controlling].
```
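A quick check of this rule of thumb before running any subgroup analysis could look like the sketch below. The vector `control` and the helper `check_subgroup_k` are hypothetical names used only for illustration, and the labels are not the data used in this guide. The threshold of 10 studies is the one mentioned above.

```r
# Hypothetical subgroup labels, one per study.
control <- c("WLC", "WLC", "no intervention", "information only",
             "WLC", "no intervention", "information only", "WLC")

# Returns TRUE if the total number of studies reaches the minimum k.
check_subgroup_k <- function(subgroups, min_k = 10) {
  k <- length(subgroups)
  print(table(subgroups))   # number of studies per subgroup
  if (k < min_k) {
    message("Only k = ", k, " studies: a subgroup analysis is not advisable.")
  }
  k >= min_k
}

ok <- check_subgroup_k(control)
```

The table of studies per subgroup is printed as well, because a total of $k \geq 10$ says little if one subgroup contains only one or two studies.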
<br><br>
---
## Subgroup Analyses using the Mixed-Effects-Model {#mixed}
```{r,echo=FALSE, message=FALSE}
library(meta)
```
To conduct subgroup analyses using the **Mixed-Effects Model** (random-effects model within subgroups, fixed-effects model between subgroups), you can use the `subgroup.analysis.mixed.effects` function we prepared for you. This function is part of the [`dmetar`](#dmetar) package. If you have the package installed already, you have to load it from your library first.
```{r, eval=FALSE}
library(dmetar)
```
If you do not want to use the `dmetar` package, you can find the source code for this function [here](https://raw.githubusercontent.com/MathiasHarrer/dmetar/master/R/subgroup_analyses_mixed_effects_function.R). In this case, *R* does not know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **Console** on the bottom left pane of RStudio, and then hit **Enter ⏎**. The function requires the `meta` package to work.
```{r,echo=FALSE}
source("dmetar/subgroup_analyses_mixed_effects_function.R")
```
For the `subgroup.analysis.mixed.effects` function, the following parameters have to be set:
```{r,echo=FALSE}
library(knitr)
Code<-c("x","subgroups", "exclude", "plot")
Description<-c("An object of class meta, generated by the metabin, metagen, metacont, metacor, metainc, or metaprop function.",
"A character vector of the same length as the number of studies within the meta-analysis, with a unique code for the subgroup each study belongs to. Must have the same order as the studies in the meta object.",
"Single string or concatenated array of strings. The name(s) of the subgroup levels to be excluded from the subgroup analysis. If 'none' (default), all subgroup levels are used for the analysis.",
"Logical. Should a forest plot for the mixed-effect subgroup analysis be generated? Calls the forest.meta function internally. TRUE by default."
)
m<-data.frame(Code,Description)
names<-c("Code","Description")
colnames(m)<-names
kable(m)
```
In my `madata` dataset, which I used previously to generate my meta-analysis output `m.hksj`, I stored the subgroup variable `Control`. This variable specifies **which control group type was employed in which study**. There are **three subgroups**: `WLC` (waitlist control), `no intervention` and `information only`.
The function to do a subgroup analysis using the mixed-effects-model with these parameters looks like this.
```{r,echo=FALSE}
load("_data/Meta_Analysis_Data.RData")
madata = Meta_Analysis_Data
m.hksj<-metagen(TE, seTE, data=madata, method.tau = "SJ", hakn = TRUE, studlab = paste(Author), comb.random = TRUE)
```
```{r,message=FALSE,warning=FALSE, fig.height=8, fig.align="center"}
subgroup.analysis.mixed.effects(x = m.hksj,
subgroups = madata$Control)
```
The results of the subgroup analysis are displayed under `Subgroup Results`. We also see that, while the **pooled effects of the subgroups differ quite substantially** ($g$ = 0.41-0.78), this difference is **not statistically significant**.
This can be seen under `Test for subgroup differences (mixed/fixed-effects (plural) model)` in the `Between groups` row. We can see that $Q=3.03$ and $p=0.2196$. This information can be reported in our meta-analysis paper. We can also produce a forest plot for the subgroup analysis using `forest`.
```{r, fig.height=8, fig.align="center", eval=F}
sgame <- subgroup.analysis.mixed.effects(x = m.hksj,
subgroups = madata$Control)
forest(sgame)
```
```{r, fig.height=8, fig.align="center", echo=F}
sgame <- subgroup.analysis.mixed.effects(x = m.hksj,
subgroups = madata$Control)
dmetar:::forest.subgroup.analysis.mixed.effects(sgame)
```
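The `Between groups` test can also be retraced by hand. The sketch below assumes the standard formulation under the fixed-effects (plural) model: $Q$ is the weighted sum of squared deviations of the subgroup effects from their inverse-variance weighted mean, compared against a $\chi^2$ distribution with the number of subgroups minus 1 degrees of freedom. The function name `q_between` and the input values are hypothetical and not taken from the output above.

```r
# Q-test for subgroup differences from pooled subgroup effects and their variances.
q_between <- function(effects, variances) {
  w      <- 1 / variances                   # inverse-variance weights
  pooled <- sum(w * effects) / sum(w)       # weighted mean across subgroups
  Q      <- sum(w * (effects - pooled)^2)   # weighted squared deviations
  df     <- length(effects) - 1
  c(Q = Q, df = df, p = pchisq(Q, df = df, lower.tail = FALSE))
}

# Hypothetical pooled effects and variances for three subgroups.
res_q <- q_between(effects   = c(0.30, 0.55, 0.70),
                   variances = c(0.020, 0.030, 0.025))
round(res_q, 4)
```

With only two subgroups, this $Q$ equals the squared $z$-statistic of the difference between the two pooled effects, so both tests give the same $p$-value.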
<br><br>
---
## Subgroup Analyses using the Random-Effects-Model
```{r,echo=FALSE}
region<-c("Netherlands","Netherlands","Netherlands","USA","USA","USA","USA","Argentina","Argentina","Argentina","Australia","Australia","Australia","China","China","China","China","China")
madata$region<-region
```
Now, let us assume I want to **know if intervention effects in my meta-analysis differ by region**. I use a **random-effects-model** and the selected countries Argentina, Australia, China, the Netherlands and the USA.
Again, I use the `m.hksj` meta-analysis output object. I can perform a random-effects-model for between-subgroup-differences using the `update.meta` function. For this function, we have to **set two parameters**.
```{r,echo=FALSE}
library(knitr)
Code<-c("byvar","comb.random")
Description<-c("Here, we specify the variable in which the subgroup of each study is stored","Whether we want to use a random-effects-model for between-subgroup-differences. In this case, we have to set comb.random = TRUE")
m<-data.frame(Code,Description)
names<-c("Code","Description")
colnames(m)<-names
kable(m)
```
```{r,echo=FALSE}
m.hksj<-metagen(TE, seTE, data=madata, method.tau = "SJ", hakn = TRUE, studlab = paste(Author), comb.random = TRUE)
```
```{r,warning=FALSE,message=FALSE}
region.subgroup<-update.meta(m.hksj,
byvar=region,
comb.random = TRUE,
comb.fixed = FALSE)
region.subgroup
```
Here, we get the **pooled effect for each subgroup** (country). Under `Test for subgroup differences (random effects model)`, we can see the **test for subgroup differences using the random-effects-model**, which is **not significant** ($Q=4.52$, $p=0.3405$). This means that we did not find significant differences in the overall effect between different regions, represented by the country in which the study was conducted.
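The reported $p$-value can be checked directly against the $Q$ statistic. The sketch assumes that the test uses a $\chi^2$ distribution with one degree of freedom fewer than the number of subgroups. The `region` vector reproduces the region labels of the 18 studies in this example, and `Q` is copied from the output above.

```r
# Region labels of the 18 studies used in this example.
region <- c(rep("Netherlands", 3), rep("USA", 4), rep("Argentina", 3),
            rep("Australia", 3), rep("China", 5))

table(region)                  # number of studies per subgroup

m  <- length(unique(region))   # number of subgroups
df <- m - 1                    # degrees of freedom of the Q-test
Q  <- 4.52                     # value reported in the output above

p <- pchisq(Q, df = df, lower.tail = FALSE)
round(p, 2)
```

The result agrees with the reported $p$-value up to the rounding of $Q$. The table also shows that each region contributes only three to five studies, so the power of this comparison is likely to be low.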
```{block,type='rmdachtung'}
**Using a fixed-effect-model for within-subgroup-pooling and a fixed-effects-model for between-subgroup-differences**
To use a fixed-effect-model in combination with a fixed-effects-model, we can also use the `update.meta` function again. The procedure is the same as the one we described before, but we have to set `comb.random` as `FALSE` and `comb.fixed` as `TRUE`.
```
<br><br>
---