# (PART) Helpful Tools {-}
# Effect Size Calculators {#effectsizecalculator}
![](_figs/effect.jpg)
Although the `meta` package can calculate all **individual effect sizes for every study** if we use the `metabin` or `metacont` function, a frequent scenario is that **some papers do not report the effect size data in the right format**. Older articles in particular often report only the results of $t$**-tests**, **ANOVAs**, or $\chi^2$**-tests**. If enough data is reported, we can also use **such outcome formats to calculate effect sizes**. This way, we can calculate the **effect size (e.g., Hedges' g)** and the **Standard Error (SE)**, which we can then use in a meta-analysis with **pre-calculated effect sizes** using the `metagen` function (see [Chapter 4.1.1](#pre.calc)).
```{block,type='rmdinfo'}
**Hedges' g**
When dealing with **continuous outcome data**, it is conventional to calculate the **Standardized Mean Difference** (SMD) as an outcome for each study, and as your **summary measure** [@borenstein2011].
A common format to calculate the SMD in single trials is **Cohen's d** [@cohen1988statistical]. Yet, this summary measure has been **shown to have a slight bias in small studies, for which it overestimates the effect** [@hedges1981distribution].
**Hedges' *g*** is a similar summary measure, but it **controls for this bias**. It uses a slightly different formula to calculate the pooled variance, $s^*_{pooled}$ instead of $s_{pooled}$. The transformation from *d* to *g* is often performed using the formula by Hedges and Olkin [@hedges1985statistical].
$$g \simeq d\times(1-\frac{3}{4(n_1+n_2)-9}) $$
```
```{block,type='rmdachtung'}
Hedges' g is **commonly used in meta-analysis**, and it's the standard output format in **RevMan**. Therefore, we highly recommend that you also use this measure in your meta-analysis.
In `meta`'s `metacont` function, Hedges' g is automatically calculated for each study if we set `sm="SMD"`. If you use the `metagen` function, however, you should calculate Hedges' g for each study yourself first.
```
To calculate the effect sizes, we will use Daniel Lüdecke's extremely helpful `esc` package [@esc]. So, please **install this package first** using the `install.packages("esc")` command, and then load it from your library.
```{r,message=FALSE}
library(esc)
```
<br><br>
**Here's an overview of all calculators covered in this guide**
1. [Calculating Hedges' *g* from the Mean and SD](#a)
2. [Calculating Hedges' *g* from a regression coefficient](#b)
3. [Calculating an Odds Ratio from *Chi-square*](#c)
4. [Calculating Hedges' *g* from a one-way ANOVA](#d)
5. [Calculating Hedges' *g* from the Mean and SE](#e)
6. [Calculating Hedges' *g* from a correlation](#f)
7. [Calculating Hedges' *g* from an independent t-test](#g)
8. [Calculating Hedges' *g* from Cohen's *d*](#h)
9. [Calculating effect sizes for studies with multiple comparisons](#i)
10. [Calculating the *Number Needed to Treat* (NNT) from an effect size](#j)
11. [Calculating the Standard Error from a *p*-value and effect size](#k)
<br><br>
## Hedges' g from the Mean and SD {#a}
To calculate Hedges' *g* from the *Mean*, *Standard Deviation*, and $n_{group}$ of both trial arms, we can use the `esc_mean_sd` function with the following parameters.
* `grp1m`: The **mean** of the **first group** (e.g., the intervention).
* `grp1sd`: The **standard deviation** of the **first group**.
* `grp1n`: The **sample size** of the **first group**.
* `grp2m`: The **mean** of the **second group**.
* `grp2sd`: The **standard deviation** of the **second group**.
* `grp2n`: The **sample size** of the **second group**.
* `totalsd`: The **full sample standard deviation**, if the standard deviation for each trial arm is not reported.
* `es.type`: the **effect measure** we want to calculate. In our case this is `"g"`. But we could also calculate Cohen's *d* using `"d"`.
<br><br>
**Here's an example**
```{r}
esc_mean_sd(grp1m = 10.3, grp1sd = 2.5, grp1n = 60,
grp2m = 12.3, grp2sd = 3.1, grp2n = 56, es.type = "g")
```
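To make the calculation behind this output more transparent, here is a minimal sketch in base *R* that computes Cohen's *d* and Hedges' *g* by hand from the same summary data. It assumes the usual pooled standard deviation for *d* and the small-sample correction by Hedges and Olkin. The function name `smd_from_mean_sd` is made up for illustration and is not part of `esc`, so small numerical differences from the `esc_mean_sd` output are possible.

```{r}
# Hypothetical helper (not part of esc): SMD from means, SDs and group sizes
smd_from_mean_sd <- function(m1, sd1, n1, m2, sd2, n2) {
  # Pooled standard deviation of both groups
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
  # Cohen's d: mean difference (group 1 minus group 2) in pooled SD units
  d <- (m1 - m2) / sd_pooled
  # Hedges' g: small-sample correction of d
  g <- d * (1 - 3 / (4 * (n1 + n2) - 9))
  c(d = d, g = g)
}

smd_manual <- smd_from_mean_sd(m1 = 10.3, sd1 = 2.5, n1 = 60,
                               m2 = 12.3, sd2 = 3.1, n2 = 56)
smd_manual
```

The sign of the result depends on which group is entered first. Here, the second group has the higher mean, so both values are negative.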
## Hedges' *g* from a regression coefficient {#b}
### Unstandardized regression coefficients
It is also possible to calculate **Hedges' *g*** from an unstandardized or standardized regression coefficient [@lipsey2001practical].
For **unstandardized coefficients**, we can use the `esc_B` function with the following parameters:
* `b`: unstandardized coefficient $b$ (the "treatment" predictor).
* `sdy`: the standard deviation of the dependent variable $y$ (i.e., the outcome).
* `grp1n`: the number of participants in the first group.
* `grp2n`: the number of participants in the second group.
* `es.type`: the **effect measure** we want to calculate. In our case this is `"g"`. But we could also calculate Cohen's *d* using `"d"`.
<br><br>
**Here's an example**
```{r}
esc_B(b=3.3,sdy=5,grp1n = 100,grp2n = 150,es.type = "g")
```
<br><br>
### Standardized regression coefficients
Here, we can use the `esc_beta` function with the following parameters:
* `beta`: standardized coefficient $\beta$ (the "treatment" predictor).
* `sdy`: the standard deviation of the dependent variable $y$ (i.e., the outcome).
* `grp1n`: the number of participants in the first group.
* `grp2n`: the number of participants in the second group.
* `es.type`: the **effect measure** we want to calculate. In our case this is `"g"`. But we could also calculate Cohen's *d* using `"d"`.
<br><br>
**Here's an example**
```{r}
esc_beta(beta=0.7, sdy=3, grp1n=100, grp2n=150, es.type = "g")
```
## Odds Ratio from *Chi-square* {#c}
To calculate the **Odds Ratio** (or any other kind of effect size measure) from $\chi^2$, we can use the `esc_chisq` function with the following parameters:
* `chisq`: the value of Chi-squared (or only `p`)
* `p`: the *p*-value of the chi-squared or phi value (or only `chisq`)
* `totaln`: total sample size
* `es.type`: the summary measure (in our case, `"cox.or"`)
<br><br>
**Here's an example**
```{r}
esc_chisq(chisq=9.9,totaln=100,es.type="cox.or")
```
## Hedges' *g* from a one-way ANOVA {#d}
We can also derive the SMD from the $F$-value of a **one-way ANOVA with two groups**. Such ANOVAs can be identified by looking at the **degrees of freedom** ($df$) reported with $F$. In a one-way ANOVA with two groups, the degrees of freedom should always start with $1$ (e.g. $F_{1,147}=5.31$). The formula for this transformation looks like this [@cohen1992power;@rosnow1996computing;@rosnow2000contrasts]:
$$d = \sqrt{ F(\frac{n_t+n_c}{n_t n_c})(\frac{n_t+n_c}{n_t+n_c-2})}$$
To calculate **Hedges' g** from $F$-values, we can use the `esc_f` function with the following parameters:
* `f`: *F*-value of the ANOVA
* `grp1n`: Number of participants in group 1
* `grp2n`: Number of participants in group 2
* `totaln`: The total number of participants (if the *n* for each group is not reported)
* `es.type`: the **effect measure** we want to calculate. In our case this is `"g"`. But we could also calculate Cohen's *d* using `"d"`.
<br><br>
**Here's an example**
```{r}
esc_f(f=5.04,grp1n = 519,grp2n = 528,es.type = "g")
```
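As a rough cross-check, the formula for $d$ shown above can be typed out directly in base *R*. This is only a sketch: `d_from_f` is a hypothetical helper and not part of `esc`, and `esc_f` may use a slightly different variant of the conversion, so the result need not match its output to every decimal.

```{r}
# Hypothetical helper: Cohen's d from the F-value of a two-group one-way ANOVA
d_from_f <- function(f, n_t, n_c) {
  n_total <- n_t + n_c
  sqrt(f * (n_total / (n_t * n_c)) * (n_total / (n_total - 2)))
}

d_anova <- d_from_f(f = 5.04, n_t = 519, n_c = 528)
# Small-sample correction to obtain Hedges' g
g_anova <- d_anova * (1 - 3 / (4 * (519 + 528) - 9))
c(d = d_anova, g = g_anova)
```

Because $F$ is always positive, this conversion only returns the size of the effect. Its direction has to be taken from the group means reported in the article.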
## Hedges' *g* from the Mean and SE {#e}
When calculating **Hedges' g** from the **Mean** and **Standard Error**, we simply make use of the fact that the Standard Error is the **Standard Deviation** divided by the square root of the sample size, so that the Standard Deviation of each group can be recovered from it [@thalheimer2002calculate]:
$$SD = SE\sqrt{n_c}$$
We can calculate **Hedges' g** using the `esc_mean_se` function with the following parameters:
* `grp1m`: The mean of the first group.
* `grp1se`: The standard error of the first group.
* `grp1n`: The sample size of the first group.
* `grp2m`: The mean of the second group.
* `grp2se`: The standard error of the second group.
* `grp2n`: The sample size of the second group.
* `es.type`: the **effect measure** we want to calculate. In our case this is `"g"`. But we could also calculate Cohen's *d* using `"d"`.
<br><br>
**Here's an example**
```{r}
esc_mean_se(grp1m = 8.5, grp1se = 1.5, grp1n = 50,
grp2m = 11, grp2se = 1.8, grp2n = 60, es.type = "g")
```
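The conversion from the Standard Error back to the Standard Deviation is simple enough to check by hand. The following base *R* sketch applies $SD = SE\sqrt{n}$ to both groups of the example. The helper name `sd_from_se` is an assumption made for illustration and does not exist in `esc`.

```{r}
# Hypothetical helper: recover the standard deviation from a standard error
sd_from_se <- function(se, n) {
  se * sqrt(n)
}

sd_grp1 <- sd_from_se(se = 1.5, n = 50)
sd_grp2 <- sd_from_se(se = 1.8, n = 60)
c(sd_grp1 = sd_grp1, sd_grp2 = sd_grp2)
```

Standard deviations recovered this way could then be used like any reported standard deviation, for example in the `esc_mean_sd` function.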
## Hedges' *g* from a correlation {#f}
For **equally sized groups** ($n_1=n_2$), we can use the following formula to derive the SMD from the point-biserial **correlation** [@rosenthal1984meta]:
$$r_{pb} = \frac{d}{\sqrt{d^2+4}}$$
And this formula for **unequally sized groups** [@aaron1998equating]:
$$r_{pb} = \frac{d}{\sqrt{d^2+ \frac{(N^2-2 \times N)}{n_1 n_2} }}$$
To convert $r_{pb}$ to **Hedges' g**, we can use the `esc_rpb` function with the following parameters:
* `r`: The *r*-value. Either *r* or its *p*-value must be given.
* `p`: The *p*-value of the correlation. Either *r* or its *p*-value must be given.
* `grp1n`: The sample size of group 1.
* `grp2n`: The sample size of group 2.
* `totaln`: Total sample size, if `grp1n` and `grp2n` are not given.
* `es.type`: the **effect measure** we want to calculate. In our case this is `"g"`. But we could also calculate Cohen's *d* using `"d"`.
```{r}
esc_rpb(r = 0.25, grp1n = 99, grp2n = 120, es.type = "g")
```
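Both formulas above express $r_{pb}$ as a function of $d$. To go in the other direction, we can solve the formula for unequally sized groups for $d$, which gives $d = r_{pb}\sqrt{a}/\sqrt{1-r_{pb}^2}$ with $a = \frac{N^2-2N}{n_1 n_2}$. The base *R* sketch below implements this rearrangement. The helper `d_from_rpb` is hypothetical and not part of `esc`, so it is meant as an illustration of the algebra rather than a replacement for `esc_rpb`.

```{r}
# Hypothetical helper: Cohen's d from a point-biserial correlation
d_from_rpb <- function(r, n1, n2) {
  n_total <- n1 + n2
  # Term that takes the place of the constant 4 when group sizes differ
  a <- (n_total^2 - 2 * n_total) / (n1 * n2)
  r * sqrt(a) / sqrt(1 - r^2)
}

d_rpb <- d_from_rpb(r = 0.25, n1 = 99, n2 = 120)
d_rpb
```

Plugging the resulting $d$ back into the formula for unequally sized groups returns the original correlation, which is a quick way to check the rearrangement.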
## Hedges' *g* from an independent t-test {#g}
The SMD can also be derived from an **independent t-test value** with the following formula [@thalheimer2002calculate]:
$$d = \frac {t(n_1+n_2)}{\sqrt{(n_1+n_2-2)(n_1n_2)}}$$
We can calculate **Hedges' g** from a **t-test** using the `esc_t` function with the following parameters:
* `t`: The t-value of the t-test. Either *t* or its *p*-value must be given.
* `p`: The *p*-value of the t-test. Either *t* or its *p*-value must be given.
* `grp1n`: The sample size of group 1.
* `grp2n`: The sample size of group 2.
* `totaln`: Total sample size, if `grp1n` and `grp2n` are not given.
* `es.type`: the **effect measure** we want to calculate. In our case this is `"g"`. But we could also calculate Cohen's *d* using `"d"`.
<br><br>
**Here's an example**
```{r}
esc_t(t = 3.3, grp1n = 100, grp2n = 150,es.type="g")
```
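For comparison, the formula for $d$ given above can be written out in a few lines of base *R*. This is a sketch under the assumption that the reported $t$-value comes from an independent two-group test. The function `d_from_t` is a hypothetical name, not an `esc` function, and its result may differ slightly from what `esc_t` reports.

```{r}
# Hypothetical helper: Cohen's d from an independent-samples t-value
d_from_t <- function(t, n1, n2) {
  t * (n1 + n2) / sqrt((n1 + n2 - 2) * n1 * n2)
}

d_t <- d_from_t(t = 3.3, n1 = 100, n2 = 150)
# Small-sample correction to obtain Hedges' g
g_t <- d_t * (1 - 3 / (4 * (100 + 150) - 9))
c(d = d_t, g = g_t)
```

Unlike the $F$-value of an ANOVA, the $t$-value keeps its sign, so the direction of the effect carries over to $d$ and $g$.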
## Hedges' *g* from Cohen's *d* {#h}
We can also directly correct **Cohen's *d*** and thus generate **Hedges' g** using the formula by Hedges and Olkin [@hedges1985statistical]:
$$g \simeq d\times(1-\frac{3}{4(n_1+n_2)-9}) $$
This can be done in R using the `hedges_g` function with the following parameters:
* `d`: The value of **Cohen's d**
* `totaln`: the total *N* in the study
```{r}
hedges_g(d = 0.75, totaln = 50)
```
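To see how much the correction actually changes the effect size, we can compute the correction factor from the formula above by hand. The base *R* sketch below uses a hypothetical helper called `hedges_correction` (not part of `esc`) and prints the factor for a few total sample sizes chosen purely for illustration.

```{r}
# Hypothetical helper: small-sample correction factor from the formula above
hedges_correction <- function(totaln) {
  1 - 3 / (4 * totaln - 9)
}

# Manual version of the example above
g_manual <- 0.75 * hedges_correction(totaln = 50)
g_manual

# The factor approaches 1 as the total sample size increases
round(hedges_correction(c(10, 20, 50, 100, 500)), 3)
```

The factor is always below 1, so $g$ is slightly smaller in absolute size than $d$, and the difference becomes negligible in large studies.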
## Multiple comparisons {#i}
![](_figs/stack.jpg)
Many randomized controlled trials do not only include a single **intervention** and **control group**, but compare the effect of **two or more interventions** to a control group. It might be tempting in such a scenario to **simply include all the comparisons between the intervention groups and control within a study into one meta-analysis**. Yet, researchers should abstain from this practice, as this would mean that the control group is used twice for the meta-analysis, thus **"double-counting"** the participants in the control group. This results in a **unit-of-analysis** error, as the effect sizes are correlated, and thus not independent, but are treated as if they stemmed from independent samples.
**There are two ways to deal with this:**
* Splitting the N of the control group: One method to control for the unit-of-analysis error to some extent would be to **split** the number of participants in the control group between the two intervention groups. So, if your control group has $N=50$ participants, you could divide the control group into two control groups with the same mean and standard deviation, and $N=25$ participants each. After this preparation step, you could calculate the effect sizes for each intervention arm. As this procedure only partially removes the unit-of-analysis error, it is not generally recommended. A big plus of this procedure, however, is that it makes [**investigations of heterogeneity**](#heterogeneity) between study arms possible.
* Another option would be to **synthesize the results of the intervention arms** to obtain one single comparison to the control group. Despite its practical limitations (sometimes, this would mean synthesizing the results from extremely different types of interventions), this procedure does get rid of the unit-of-analysis error problem, and is thus recommended from a statistical standpoint. The following calculations will deal with this option.
To synthesize the **pooled effect size data** (pooled Mean, Standard Deviation and N), we have to use the following formulae:
$$N_{pooled}=N_1+N_2$$
$$M_{pooled}=\frac{N_1M_1+N_2M_2}{N_1+N_2}$$
$$SD_{pooled} = \sqrt{\frac{(N_1-1)SD^{2}_{1}+ (N_2-1)SD^{2}_{2}+\frac{N_1N_2}{N_1+N_2}(M^{2}_1+M^{2}_2-2M_1M_2)} {N_1+N_2-1}}$$
As these formulae are quite lengthy, we prepared the function `pool.groups`, which does the pooling for you automatically. The function is part of the [`dmetar`](#dmetar) package. If you have the package installed already, you have to load it into your library first.
```{r, eval=FALSE}
library(dmetar)
```
If you don't want to use the `dmetar` package, you can find the source code for this function [here](https://raw.githubusercontent.com/MathiasHarrer/dmetar/master/R/pool.groups.R). In this case, *R* doesn't know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **console** in the bottom left pane of RStudio, and then hitting **Enter ⏎**.
```{r, echo=FALSE}
source("dmetar/pool.groups.R")
```
**To use this function, we have to specify the following parameters:**
* `n1`: The N in the first group
* `n2`: The N in the second group
* `m1`: The Mean of the first group
* `m2`: The Mean of the second group
* `sd1`: The Standard Deviation of the first group
* `sd2`: The Standard Deviation of the second group
**Here's an example**
```{r}
pool.groups(n1=50,
n2=50,
m1=3.5,
m2=4,
sd1=3,
sd2=3.8)
```
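If you would like to see what the pooling formulae for $N_{pooled}$, $M_{pooled}$ and $SD_{pooled}$ do step by step, they can also be written out in base *R*. The following is a minimal sketch, not the `dmetar` implementation: the function name `pool_two_groups` is hypothetical, and the column names of its output are simply chosen to resemble the pooled values.

```{r}
# Hypothetical helper (not the dmetar function): pool two groups into one
pool_two_groups <- function(n1, n2, m1, m2, sd1, sd2) {
  # Pooled sample size
  n_pooled <- n1 + n2
  # Pooled mean, weighted by group size
  m_pooled <- (n1 * m1 + n2 * m2) / n_pooled
  # Pooled SD: within-group variation plus variation between the group means
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2 +
                       (n1 * n2 / n_pooled) * (m1^2 + m2^2 - 2 * m1 * m2)) /
                      (n_pooled - 1))
  data.frame(Mpooled = m_pooled, SDpooled = sd_pooled, Npooled = n_pooled)
}

pooled_manual <- pool_two_groups(n1 = 50, n2 = 50,
                                 m1 = 3.5, m2 = 4,
                                 sd1 = 3, sd2 = 3.8)
pooled_manual
```

Note that the last term in the numerator, $M^2_1+M^2_2-2M_1M_2$, is just $(M_1-M_2)^2$, so the pooled standard deviation grows when the two group means lie further apart.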
```{block, type='rmdinfo'}
**What should I do when a study has more than two intervention groups?**
If a study has more than two intervention groups you want to synthesize (e.g. four arms, with three distinct intervention arms), you can **pool the effect size data for the first two interventions**, and then **synthesize the pooled data you calculated with the data from the third group**.
This is fairly straightforward if you save the output from `pool.groups` as an object, and then use the `$` operator:
```
First, pool the **first** and **second intervention group**. I will save the output as `res`.
```{r}
res <- pool.groups(n1 = 50,
n2 = 50,
m1 = 3.5,
m2 = 4,
sd1 = 3,
sd2 = 3.8)
```
Then, use the pooled data saved in `res` and **pool it with the data from the third group**, using the `$` operator to access the different values saved in `res`.
```{r}
pool.groups(n1 = res$Npooled,
n2 = 60,
m1 = res$Mpooled,
m2 = 4.1,
sd1=res$SDpooled,
sd2 = 3.8)
```
## Number Needed to Treat (*NNT*) {#j}
Effect sizes such as Cohen's $d$ or Hedges' $g$ are often difficult to interpret from a clinical standpoint. What exactly does an effect of $g=0.35$ mean for patients, medical professionals, or other stakeholders? A good way to facilitate the understanding of your results is to calculate the **Number Needed to Treat**, or $NNT$, which signifies how many additional patients must be treated with an intervention or treatment to avoid one additional negative event (e.g., relapse) or to achieve one additional positive event (e.g., symptom remission, response). There are two ways to calculate the $NNT$ from effect size data such as $d$ or $g$:
* The method by **Kraemer and Kupfer (2006)** calculates the $NNT$ from the Area Under the Curve ($AUC$), defined as the probability that a patient in the treatment group has an outcome preferable to one in the control group. This method allows us to calculate the $NNT$ directly from $d$ or $g$ without any extra variables.
* The method by **Furukawa** calculates the $NNT$ from $d$ using a reasonable estimate of $CER$, in most contexts the assumed response rate in the control group.
Furukawa's method has been shown to be superior in predicting the $NNT$ compared to the Kraemer & Kupfer method [@furukawa2011obtain]. If reasonable assumptions can be made concerning the $CER$, Furukawa's method should therefore be preferred. When **event data** is used, the $CER$ and $EER$ (experimental group event rate) are calculated first, and the standard definition of the $NNT$, $\frac{1}{EER-CER}$, can be applied.
<br></br>
### The `NNT` function
To calculate the $NNT$ using these three methods, we prepared the `NNT` function for you. The function is part of the [`dmetar`](#dmetar) package. If you have the package installed already, you have to load it into your library first.
```{r, eval=FALSE}
library(dmetar)
```
If you don't want to use the `dmetar` package, you can find the source code for this function [here](https://raw.githubusercontent.com/MathiasHarrer/dmetar/master/R/NNT.R). In this case, *R* doesn't know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **console** in the bottom left pane of RStudio, and then hitting **Enter ⏎**.
<br></br>
**Kraemer & Kupfer method**
To use the Kraemer & Kupfer method, we only have to provide the function with an effect size ($d$ or $g$).
```{r, echo=FALSE}
source("dmetar/NNT.R")
```
```{r}
NNT(d = 0.245)
```
<br></br>
**Furukawa's method**
Once we supply a $CER$ value additionally, Furukawa's method is used automatically.
```{r}
NNT(d = 0.245, CER = 0.35)
```
<br></br>
**Binary event data**
We can also provide binary event data to calculate the $NNT$ directly. To do this, we only need the number of events in the experimental group, `event.e`, and the total sample size of the experimental group, `n.e`, as well as the same information for the control group in `event.c` and `n.c`.
```{r}
NNT(event.e = 10, event.c = 20, n.e = 200, n.c = 200)
```
You can read more about this function in the [documentation](https://dmetar.protectlab.org/reference/NNT.html).
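For readers who want to see the logic behind the three methods, here is a rough base *R* sketch. It rests on assumptions that are not spelled out above: outcomes are taken to be normally distributed with equal variances in both groups, so that $AUC = \Phi(d/\sqrt{2})$ and $NNT = \frac{1}{2 \times AUC - 1}$ for the Kraemer & Kupfer method, and $EER = \Phi(d + \Phi^{-1}(CER))$ for Furukawa's method. All three function names are hypothetical. This is not the `dmetar` source code, and its results may deviate from the `NNT` function.

```{r}
# Hypothetical helpers (not the dmetar implementation)

# Kraemer & Kupfer: NNT from the AUC, assuming normally distributed outcomes
nnt_kraemer_kupfer <- function(d) {
  auc <- pnorm(d / sqrt(2))
  1 / (2 * auc - 1)
}

# Furukawa: translate d and the control event rate (CER) into an
# experimental event rate (EER), assuming normal outcomes with equal variances
nnt_furukawa <- function(d, cer) {
  eer <- pnorm(d + qnorm(cer))
  1 / (eer - cer)
}

# Binary event data: event rates first, then 1 / (EER - CER) in absolute terms
nnt_events <- function(event_e, n_e, event_c, n_c) {
  eer <- event_e / n_e
  cer <- event_c / n_c
  abs(1 / (eer - cer))
}

nnt_kk <- nnt_kraemer_kupfer(d = 0.245)
nnt_fu <- nnt_furukawa(d = 0.245, cer = 0.35)
nnt_ev <- nnt_events(event_e = 10, n_e = 200, event_c = 20, n_c = 200)
c(kraemer_kupfer = nnt_kk, furukawa = nnt_fu, events = nnt_ev)
```

In the event data sketch, the absolute value is taken so that a reduction of negative events also yields a positive $NNT$. Whether the result describes a benefit or a harm still has to be judged from the direction of the event rates.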
<br></br>
## Standard Error from *p*-value {#k}
When extracting effect sizes from published articles, it can sometimes happen that a study only reports the effect size (e.g., Cohen's $d$) and its $p$-value, but nothing more. To pool results in a meta-analysis (for example using the `metagen` function, see Chapter 4), however, we need some metric for the dispersion of the effect size, preferably the **Standard Error** ($SE$).
Standard errors can be calculated from effect sizes such as Cohen's $d$ or the Risk Ratio ($RR$) and their $p$-values using the formula by @altman2011obtain. We have prepared a function called `se.from.p` which implements this formula. The function is part of the [`dmetar`](#dmetar) package. If you have the package installed already, you have to load it into your library first.
```{r, eval=FALSE}
library(dmetar)
```
If you don't want to use the `dmetar` package, you can find the source code for this function [here](https://raw.githubusercontent.com/MathiasHarrer/dmetar/master/R/SE_from_p.R). In this case, *R* doesn't know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **console** in the bottom left pane of RStudio, and then hitting **Enter ⏎**. The function requires the `esc` package to work.
<br></br>
**Effect sizes based on a difference**
Assuming we have a study with $N=75$ participants reporting an effect size of $d=0.71$ with $p=0.013$, we can calculate the standard error like this:
```{r, echo=FALSE, message=FALSE}
library(esc)
source("dmetar/se_from_p.R")
```
```{r}
se.from.p(effect.size = 0.71,
p = 0.013,
N = 75,
effect.size.type= "difference")
```
<br></br>
**Effect sizes based on a ratio**
Assuming we have a study with $N=200$ participants reporting an effect size of $OR=0.91$ with $p=0.05$, we can calculate the standard error like this:
```{r chunk 4}
se.from.p(effect.size = 0.91,
p = 0.05,
N = 200,
effect.size.type= "ratio")
```
As you can see from the output, the function automatically calculates the log-transformed effect size and standard error, which is needed for usage in the `metagen` function (see [Chapter 4.3.3](#binarymetagen)).
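To illustrate the idea behind this conversion, the following base *R* sketch applies the approximation by Altman and Bland as we understand it: a $z$-value is first recovered from the two-sided $p$-value via $z = -0.862 + \sqrt{0.743 - 2.404 \times \log(p)}$, and the standard error is then the (log-transformed, for ratios) effect size divided by $z$. The function `se_from_p_sketch` is a hypothetical name and not the `dmetar` function. It ignores the sample size $N$, so its output should only be read as a plausibility check.

```{r}
# Hypothetical helper (not the dmetar function): SE from an effect size and its p-value
se_from_p_sketch <- function(effect_size, p, type = c("difference", "ratio")) {
  type <- match.arg(type)
  # Ratio measures (e.g., OR, RR) are analysed on the log scale
  est <- if (type == "ratio") log(effect_size) else effect_size
  # Approximate z-value belonging to a two-sided p-value
  z <- -0.862 + sqrt(0.743 - 2.404 * log(p))
  c(estimate = est, se = abs(est / z))
}

se_diff <- se_from_p_sketch(effect_size = 0.71, p = 0.013, type = "difference")
se_ratio <- se_from_p_sketch(effect_size = 0.91, p = 0.05, type = "ratio")
se_diff
se_ratio
```

For $p=0.05$, the recovered $z$-value is close to the familiar 1.96, which gives a quick indication that the approximation behaves as expected.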