---
title: "Software packages"
---
Working with statistical software is the daily business of our statisticians. Most programming languages allow users to create their own packages of custom functions to reduce errors in repeated tasks. The software used by SCTO statisticians, primarily R and Stata, is no different in this respect. This page provides an overview of some of these packages.
<!-- packages are listed in alphabetical order -->
# SCTO funded packages
The SCTO Statistics and Methodology platform offers grants to associated statisticians specifically for developing such statistical packages, either to create completely new software or to further develop existing software.
## `presize` - precision based sample size estimation
![](https://img.shields.io/badge/Language-R-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg)](https://github.com/CTU-Bern/presize) [![](https://img.shields.io/badge/Website-blue.svg)](https://ctu-bern.github.io/presize/) [![](https://www.r-pkg.org/badges/version/presize?color=green)](https://cran.r-project.org/package=presize) [![](https://joss.theoj.org/papers/10.21105/joss.03118/status.svg)](https://doi.org/10.21105/joss.03118)
`presize` is an R package for precision based sample size calculation. It provides a large number of methods for estimating the number of samples required to gain a confidence interval of a given width, or the width that might be expected with a given sample size.
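To make the idea of precision-based planning concrete, the following is a minimal base R sketch, not the `presize` implementation. It assumes a Wilson score interval for a single proportion, and the helper names `wilson_width` and `n_for_width` are hypothetical and exist only for this illustration.

```{r}
# Width of the Wilson score confidence interval for a proportion p
# estimated from n observations (illustrative helper, not part of presize)
wilson_width <- function(p, n, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  2 * z / (1 + z^2 / n) * sqrt(p * (1 - p) / n + z^2 / (4 * n^2))
}

# Smallest n whose expected CI width does not exceed the target width
# (simple linear search, sufficient for an illustration)
n_for_width <- function(p, width, conf.level = 0.95, n_max = 1e5) {
  for (n in 2:n_max) {
    if (wilson_width(p, n, conf.level) <= width) return(n)
  }
  NA_integer_
}

# expected CI width with 100 observations and a proportion of 0.5
wilson_width(p = 0.5, n = 100)

# observations needed for a CI no wider than 0.1
n_for_width(p = 0.5, width = 0.1)
```

The two helpers mirror the two directions described above: from a sample size to the expected interval width, and from a target width to the required sample size.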
<details>
<summary>Example</summary>
Assuming that we want to estimate the confidence interval (CI) around the sensitivity of a test, but we're not sure of the sensitivity, we can estimate the CI width in a range of scenarios as follows.
```{r}
#| message: false
#| code-fold: true
library(presize)
# set up a range of scenarios
scenarios <- expand.grid(sens = seq(.5, .95, .1),
prev = seq(.1, .2, .04),
ntot = c(250, 350))
# calculate the CI width at ntot individuals with prev prevalence of event
scenario_data <- prec_sens(sens = scenarios$sens,
prev = scenarios$prev,
ntot = scenarios$ntot,
method = "wilson")
# plot the scenarios with ggplot2
scenario_df <- as.data.frame(scenario_data)
library(ggplot2)
ggplot(scenario_df,
aes(x = sens,
y = conf.width,
# convert colour to factor for distinct colours rather than a continuum
col = as.factor(prev),
group = prev)) +
geom_line() +
labs(x = "Sensitivity", y = "CI width", col = "Prevalence") +
facet_wrap(vars(ntot))
```
</details>
For ease of use, `presize` also includes a shiny app for point-and-click use, which is also available on the internet.
<details>
<summary>Installation</summary>
`presize` can be installed in R via the following methods:

    # from CRAN (the stable version)
    install.packages("presize")

    # from CTU Bern's package universe (the development version)
    install.packages("presize", repos = "https://ctu-bern.r-universe.dev/")
</details>
## `redcaptools` - a package for working with REDCap data in R
![](https://img.shields.io/badge/Language-R-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg)](https://github.com/CTU-Bern/redcaptools) [![](https://img.shields.io/badge/Website-blue.svg)](https://ctu-bern.github.io/redcaptools/)
REDCap is a popular database for clinical research, used by many of the CTUs in Switzerland. One aggravation with REDCap data exports is that all of the data is in a single file, which can contain a lot of empty cells when more complicated database designs are used. `redcaptools` has tools to automatically pull the database apart into forms for easier use. Similar to `secuTrialR`, it also labels variables, and prepares date and factor variables. The package is primarily for interacting with REDCap via the Application Programming Interface (API), allowing easy scripted exports.
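The idea of pulling a flat export apart into forms can be sketched in base R. This is only an illustration on toy data, not how `redcaptools` is implemented: the `flat` and `dictionary` data frames and the `split_by_form` helper are made up, and the sketch assumes a data dictionary with `field_name` and `form_name` columns plus `record_id` and `event` as key columns.

```{r}
# toy flat export: one row per record and event, with many empty cells
flat <- data.frame(
  record_id = c(1, 1, 2, 2),
  event     = c("baseline", "followup", "baseline", "followup"),
  age       = c(54, NA, 61, NA),
  sex       = c("f", NA, "m", NA),
  sbp       = c(NA, 132, NA, 128)
)

# toy data dictionary mapping each field to its form
dictionary <- data.frame(
  field_name = c("age", "sex", "sbp"),
  form_name  = c("demographics", "demographics", "vitals")
)

# hypothetical helper: one data frame per form, dropping rows in which
# every field of that form is empty
split_by_form <- function(data, dict, keys = c("record_id", "event")) {
  forms <- split(dict$field_name, dict$form_name)
  lapply(forms, function(fields) {
    has_data <- rowSums(!is.na(data[, fields, drop = FALSE])) > 0
    data[has_data, c(keys, fields), drop = FALSE]
  })
}

by_form <- split_by_form(flat, dictionary)
names(by_form)
by_form$vitals
```

Each element of the resulting list holds only the rows and columns that belong to one form, which removes the empty cells of the flat export.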
<details>
<summary>Example</summary>
By supplying the API token generated by REDCap, together with the API's URL, the `redcap_export_byform` function can be used to export all data from the database by form. Each form is returned as an element of a list.
```{r}
#| eval: false
library(redcaptools)
token <- "some-long-string-provided-by-redcap"
url <- "https://link.to.redcap/api/"
dat <- redcap_export_byform(token, url)
```
The 'normal' format can be exported via the `redcap_export_tbl` function:
```{r}
#| eval: false
record_data <- redcap_export_tbl(token, url, "record")
meta <- redcap_export_tbl(token, url, "metadata")
```
This function can also be used to export various other API endpoints (e.g. various types of metadata or specific forms).
The data can then be formatted using the metadata and the `rc_prep` function:
```{r}
#| eval: false
prepped <- rc_prep(dat, meta)
```
</details>
<details>
<summary>Installation</summary>
`redcaptools` can be installed in R via the following methods:

    # from CTU Bern's package universe (the development version)
    install.packages("redcaptools", repos = "https://ctu-bern.r-universe.dev/")

    # from GitHub
    remotes::install_github("CTU-Bern/redcaptools")
</details>
## `selcorr` - post-selection inference for generalized linear models
![](https://img.shields.io/badge/Language-R-red.svg)
[![](https://www.r-pkg.org/badges/version/selcorr?color=green)](https://cran.r-project.org/package=selcorr)
`selcorr` calculates (unconditional) post-selection confidence intervals and p-values for the coefficients of (generalized) linear models.
<details>
<summary>Example</summary>
```{r}
#| eval: false
library(selcorr)
## linear regression:
selcorr(lm(Fertility ~ ., swiss))
## logistic regression:
swiss.lr = within(swiss, Fertility <- (Fertility > 70))
selcorr(glm(Fertility ~ ., binomial, swiss.lr))
```
A parallel bootstrapping approach is also available.
```{r}
#| eval: false
#| code-fold: true
library(future.apply)
plan(multisession)
boot.repl = future_replicate(8, selcorr(lm(Fertility ~ ., swiss), boot.repl = 1000,
quiet = TRUE)$boot.repl, simplify = FALSE)
plan(sequential)
selcorr(lm(Fertility ~ ., swiss), boot.repl = do.call("rbind", boot.repl))
```
</details>
<details>
<summary>Installation</summary>
`selcorr` can be installed in R from CRAN:

    # from CRAN (the stable version)
    install.packages("selcorr")
</details>
## `sse` - sample size estimation
![](https://img.shields.io/badge/Language-R-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg)](https://github.com/thofab/sse) [![](https://img.shields.io/badge/R%20forge-grey.svg)](http://r-forge.r-project.org/projects/power/) [![](https://www.r-pkg.org/badges/version/sse?color=green)](https://cran.r-project.org/package=sse)
`sse` is another R package for sample size calculation that has been in use at CTU Basel for many years. Its approach is very general, allowing a wide range of scenarios to be assessed rapidly. Whereas `presize` targets precision-based calculations, `sse` is aimed at hypothesis testing, although it is general enough to be used in both frameworks.
<details>
<summary>Example</summary>
We want to find the sample size for comparing two means. We are unsure of the standard deviation to expect, so we assess the sample size across a range of standard deviations. Assuming that a standard deviation of 12 is appropriate in this case, and we want a power of 90%, we can plot the power curve:
```{r}
#| message: false
#| code-fold: true
library(sse)
## defining the range of n and theta to be evaluated
psi <- powPar(
# SD values
theta = seq(from = 5, to = 20, by = 1),
# sample sizes
n = seq(from = 5, to = 50, by = 2),
# group means
muA = 0,
muB = 20)
## define a function to return the power in each scenario
powFun <- function(psi){
power.t.test(n = n(psi)/2,
delta = pp(psi, "muA") - pp(psi, "muB"),
sd = theta(psi)
)$power
}
## evaluate the power-function for all combinations of n and theta
calc <- powCalc(psi, powFun)
## choose one particular example at theta of 12 and power of 0.9
pow <- powEx(calc, theta = 12, power = 0.9)
## drawing the power plot with 3 contour lines
plot(pow,
xlab = "Standard Deviation",
ylab = "Total Sample Size",
at = c(0.85, 0.9, 0.95))
```
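As a cross-check on a single scenario, base R's `power.t.test` can solve for the sample size directly. This sketch reuses the values from the example (a difference in means of 20, a standard deviation of 12 and 90% power) and assumes the default two-sided 5% significance level. It is not part of `sse`, and the object names `single` and `n_total` are arbitrary.

```{r}
# per-group sample size for a two-sample t-test
single <- power.t.test(delta = 20, sd = 12, power = 0.9)

# round up and double to obtain the total sample size
n_total <- 2 * ceiling(single$n)
n_total
```

The result can be compared with the contour line for a power of 0.9 at a standard deviation of 12 in the `sse` plot, which is also expressed as a total sample size.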
</details>
<details>
<summary>Installation</summary>
`sse` can be installed in R via the following methods:

    # from CRAN (the stable version)
    install.packages("sse")

    # from CTU Bern's package universe (the development version)
    install.packages("sse", repos = "https://ctu-bern.r-universe.dev/")
</details>
## `sts_graph_landmark` - landmark analysis graphs
![](https://img.shields.io/badge/Language-Stata-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg){fig-align="left"}](https://github.com/CTU-Bern/sts_graph_landmark)
`sts_graph_landmark` is a Stata program to create landmark analysis Kaplan-Meier curves, complete with risk table.
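For readers more familiar with R, the idea behind a landmark analysis can be sketched with the `survival` package. This illustrates the general technique only, not `sts_graph_landmark` itself: the `lung` dataset, the landmark of 200 days and the variable names are arbitrary choices made for the example.

```{r}
library(survival)

landmark <- 200

# before the landmark: follow everyone, censoring at the landmark
pre <- lung
pre$event <- as.integer(pre$status == 2 & pre$time <= landmark)
pre$time  <- pmin(pre$time, landmark)
fit_pre <- survfit(Surv(time, event) ~ sex, data = pre)

# after the landmark: only participants still at risk, clock reset to zero
post <- subset(lung, time > landmark)
post$event <- as.integer(post$status == 2)
post$time  <- post$time - landmark
fit_post <- survfit(Surv(time, event) ~ sex, data = post)

# Kaplan-Meier curves for the two periods
plot(fit_pre,  col = 1:2, xlab = "Days since start", ylab = "Survival")
plot(fit_post, col = 1:2, xlab = "Days since landmark", ylab = "Survival")
```

Restricting the second fit to participants still under observation at the landmark is what distinguishes a landmark analysis from a single Kaplan-Meier curve over the whole follow-up.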
<details>
<summary>Example</summary>
Using `sts_graph_landmark` is consistent with the other `sts_*` programs in Stata. The dataset should be `stset` and then `sts_graph_landmark` can be called specifying the landmark time in `at`.
```{r}
#| eval: false
#| code-fold: true
* load example dataset (note: this example is nonsensical and only for graphing purposes)
webuse stan3, clear
* set data as survival data
stset t1, failure(died) id(id)
* label treatment arms
label define posttran_l 0 "prior transplantation" 1 "after transplantation"
label value posttran posttran_l
* create landmark plot and table
sts_graph_landmark, at(200) by(posttran) risktable
```
![](docs/sts_landmark_graph.png)
</details>
<details>
<summary>Installation</summary>
`sts_graph_landmark` can be installed in Stata from GitHub:

    net install github, from("https://haghish.github.io/github/")
    github install CTU-Bern/sts_graph_landmark
</details>
## `secuTrialR` - import secuTrial datasets to R
![](https://img.shields.io/badge/Language-R-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg)](https://github.com/SwissClinicalTrialOrganisation/secuTrialR) [![](https://img.shields.io/badge/Website-blue.svg)](https://swissclinicaltrialorganisation.github.io/secuTrialR/) [![](https://www.r-pkg.org/badges/version/secuTrialR?color=green)](https://cran.r-project.org/package=secuTrialR) [![](https://joss.theoj.org/papers/10.21105/joss.02816/status.svg)](https://doi.org/10.21105/joss.02816)
<!-- because this is technically not a stats package, i put it last, rather than in alphabetical order -->
secuTrial datasets consist of a lot of files and it can be difficult to get to grips with them. `secuTrialR` tries to reduce the burden by providing methods to import, format (e.g. adding labels to variables) and explore the data.
<details>
<summary>Example</summary>
Data can be read into R using `read_secuTrial`. The `visit_structure` function gives an idea of which forms are required at which visit. `plot_recruitment` is for plotting trial recruitment.
```{r}
#| message: false
#| layout-nrow: 1
#| code-fold: true
library(secuTrialR)
# prepare path to example export
export_location <- system.file("extdata", "sT_exports", "snames",
"s_export_CSV-xls_CTU05_short_miss_en_utf8.zip",
package = "secuTrialR")
# read all export data
sT_export <- read_secuTrial(data_dir = export_location)
plot(visit_structure(sT_export))
plot_recruitment(sT_export)
```
</details>
`secuTrialR` was developed by the data management platform with substantial input from members of the statistics and methodology platform.
<details>
<summary>Installation</summary>
`secuTrialR` can be installed in R via the following methods:

    # from CRAN (the stable version)
    install.packages("secuTrialR")

    # from CTU Bern's package universe (the development version)
    install.packages("secuTrialR", repos = "https://ctu-bern.r-universe.dev/")
</details>
<!-- eventually... -->
<!-- ## `shiny_template` - a template shiny app for use in clinical trials and registries -->
<!-- Rather than a fully blown R package, it provides a template that can be adapted to be used with trial databases. -->
<!-- `shiny_template` was developed by the data management platform with substantial input from members of the statistics and methodology platform. -->
# Other software developed by CTUs
CTUs sometimes also develop software without explicit funding from the SCTO platform. Those packages are listed below.
## `accrualPlot` - simple creation of accrual plots
![](https://img.shields.io/badge/Language-R-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg)](https://github.com/CTU-Bern/accrualPlot) [![](https://img.shields.io/badge/Website-blue.svg)](https://ctu-bern.github.io/accrualPlot/) [![](https://www.r-pkg.org/badges/version/accrualPlot?color=green)](https://cran.r-project.org/package=accrualPlot)
`accrualPlot` is an R package for summarizing trial recruitment data. With relatively little code, it is possible to create various plots and tables useful for recruitment reports, as well as predict the end of recruitment based on the recruitment to date.
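The idea of predicting the end of recruitment can be illustrated with a deliberately naive base R sketch. It assumes a constant recruitment rate and simulated dates, and it is not the method used by `accrualPlot`; the names `recruited` and `predict_end` are made up for the example.

```{r}
set.seed(1)
# simulated recruitment dates: 120 participants over roughly 200 days
recruited <- sort(as.Date("2023-01-01") + sample(0:199, 120, replace = TRUE))

# hypothetical helper: extrapolate the average daily rate to the target
predict_end <- function(dates, target) {
  n_so_far <- length(dates)
  days_elapsed <- as.numeric(max(dates) - min(dates)) + 1
  rate <- n_so_far / days_elapsed            # participants per day
  days_needed <- ceiling((target - n_so_far) / rate)
  # if the target is already reached, return the last recruitment date
  max(dates) + max(days_needed, 0)
}

predict_end(recruited, target = 300)
```

A constant rate is rarely realistic, because sites open at different times and recruitment often accelerates, which is why a dedicated package is preferable for real reports.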
<details>
<summary>Example</summary>
`accrualPlot` includes a simulated dataset of participants recruited into a trial in one of three sites. The `accrual_create_df` function is used to define the properties of the sites (e.g. start dates if they differ from the first participant's recruitment date). The plot and summary functions can then be used to plot or tabulate the data. The data can be plotted using either base graphics or `ggplot2`.
```{r}
#| code-fold: true
#| layout-ncol: 2
#| layout-nrow: 2
#| message: false
library(accrualPlot)
data(accrualdemo)
df <- accrual_create_df(accrualdemo$date, by = accrualdemo$site)
# cumulative recruitment
plot(df, which = "cum", engine = "ggplot2")
# absolute recruitment (daily/weekly/monthly)
plot(df, which = "abs", engine = "ggplot2")
# predict end date
plot(df, which = "pred", target = 300, engine = "ggplot2")
# summary table
library(gt)
gt(summary(df)) %>%
tab_options(column_labels.hidden = TRUE)
```
</details>
<details>
<summary>Installation</summary>
`accrualPlot` can be installed in R via the following methods:

    # from CRAN (the stable version)
    install.packages("accrualPlot")

    # from CTU Bern's package universe (the development version)
    install.packages("accrualPlot", repos = "https://ctu-bern.r-universe.dev/")
</details>
## `btable` - create baseline tables in Stata
![](https://img.shields.io/badge/Language-Stata-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg){fig-align="left"}](https://github.com/CTU-Bern/btable)
Creating baseline tables is a repetitive task, as each paper needs one. `btable` provides a powerful approach to creating them. See the [making baseline tables article for an example](baselinetables.qmd#stata-btable). More information on `btable` can be found [on GitHub](https://github.com/CTU-Bern/btable){target="_blank"}.
<details>
<summary>Installation</summary>
`btable` can be installed in Stata via the following method:

    net install github, from("https://haghish.github.io/github/")
    github install CTU-Bern/btable
</details>
## `btabler` - format tables for LaTeX reports
![](https://img.shields.io/badge/Language-R-red.svg) [![](https://img.shields.io/badge/GitHub-silver.svg)](https://github.com/CTU-Bern/btabler){target="_blank"} [![](https://img.shields.io/badge/Website-blue.svg)](https://ctu-bern.github.io/btabler/){target="_blank"}
`btabler` adds additional functionality to the `xtable` package, such as merging column headers, for use in tables for LaTeX. It was originally developed as an easy way to put tables generated by `btable` into LaTeX reports, hence the similarity in names.
<details>
<summary>Example</summary>
```{r}
#| eval: false
library(btabler)
df <- data.frame(name = c("", "", "Row 1", "Row2"),
out_t = c("Total", "mean (sd)", "t1", "t1"),
out_1 = c("Group 1", "mean (sd)", "g11", "g12"),
out_2 = c("Group 2", "mean (sd)", "g21", "g22"))
btable(df, nhead = 2, nfoot = 0, caption = "Table1")
```
This will look like the following after LaTeX has created your PDF:
![](docs/btabler_basic.png)
</details>
<details>
<summary>Installation</summary>
`btabler` can be installed in R via the following method:

    # from CTU Bern's package universe (the development version)
    install.packages("btabler", repos = "https://ctu-bern.r-universe.dev/")
</details>
## `HSAr` - create reproducible hospital service areas in R
![](https://img.shields.io/badge/Language-R-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg)](https://github.com/aghaynes/HSAr) [![](https://img.shields.io/badge/Health%20Serv%20Res-10.1111/1475--6773.13275-apple.svg)](https://doi.org/10.1111/1475-6773.13275) [![](https://img.shields.io/badge/PubMed-PMC7240760-apple.svg)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7240760/)
Hospital service areas (HSAs) can be useful for hospital planning, but their main use is in small area research. They are traditionally made largely by hand, by assigning each location to the hospital where most residents go and then iteratively moving locations until two main criteria are fulfilled: an HSA should not have detached islands, and at least 50% of the hospitalizations of its residents should take place within it. The iterative steps are largely manual, subjective work. As such, the reproducibility of HSA creation is poor.
`HSAr` provides an automated algorithm for creating HSAs by starting at the hospital and building the HSA around it until all regions in the provided shapefile are assigned to an HSA.
`HSAr` was developed as part of National Research Programme 74 (NRP 74), "Smarter Health Care".
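The two ingredients of the traditional approach, assigning each region to its most-used hospital and checking that at least 50% of hospitalizations stay within the HSA, can be sketched in base R. The patient flows below are invented, and the code illustrates the criteria only; it is not the algorithm implemented in `HSAr`, and it ignores the contiguity requirement.

```{r}
# invented patient flows: hospitalizations by region of residence and hospital
flows <- data.frame(
  region   = rep(c("A", "B", "C", "D"), each = 2),
  hospital = rep(c("H1", "H2"), times = 4),
  n        = c(80, 20, 60, 40, 30, 70, 45, 55)
)

# step 1: assign each region to the hospital that most of its residents use
by_region  <- split(flows, flows$region)
assignment <- vapply(by_region,
                     function(d) d$hospital[which.max(d$n)],
                     character(1))
flows$hsa <- unname(assignment[flows$region])

# step 2: share of each HSA's hospitalizations that stay within the HSA
stay_share <- sapply(unique(flows$hsa), function(h) {
  in_hsa <- flows$hsa == h
  sum(flows$n[in_hsa & flows$hospital == h]) / sum(flows$n[in_hsa])
})

assignment
stay_share

# the 50% criterion
stay_share >= 0.5
```

In practice, regions would then be moved between HSAs until both criteria hold, which is the manual step that an automated algorithm replaces.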
<details>
<summary>Example</summary>
</details>
<details>
<summary>Installation</summary>
`HSAr` can be installed in R via the following method:

    # from CTU Bern's package universe (the development version)
    install.packages("HSAr", repos = "https://ctu-bern.r-universe.dev/")
</details>
## `kpitools` - tools to assist with risk based management KPIs
![](https://img.shields.io/badge/Language-R-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg)](https://github.com/CTU-Bern/kpitools) [![](https://img.shields.io/badge/Website-blue.svg)](https://ctu-bern.github.io/kpitools/)
It is not enough to simply run a trial. ICH E6(R2), the guideline for good clinical practice (GCP), also requires risk-based monitoring to be performed. `kpitools` provides a set of summary functions and a standardized format for presenting the key performance indicators (KPIs) that are typically defined for risk-based monitoring strategies.
<details>
<summary>Example</summary>
We might believe that time of day is an indicator of data fabrication, because it is not possible for participants to be randomised at certain times of the day. The `fab_tod` function can help depict that.
```{r}
#| eval: false
library(kpitools)
library(magrittr) # provides the %>% pipe used below
set.seed(12345)
dat <- data.frame(
x = lubridate::ymd_h("2020-05-01 13") + 60^2*rnorm(40, 0, 3),
mean = rnorm(40, 56, 20),
by = sample(1:4, 40, prob = c(.2,.25,.4,.4), replace = TRUE)
)
dat %>% kpi("mean", kpi_fn_mean, by = "by") %>% plot
dat %>% fab_tod("x")
```
</details>
<details>
<summary>Installation</summary>
`kpitools` can be installed in R via the following method:

    # from CTU Bern's package universe (the development version)
    install.packages("kpitools", repos = "https://ctu-bern.r-universe.dev/")
</details>
## `stata_secutrial` - some Stata code to do data import and preparation of secuTrial datasets
![](https://img.shields.io/badge/Language-Stata-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg){fig-align="left"}](https://github.com/CTU-Bern/stata_secutrial)
Similar to `secuTrialR` above, `stata_secutrial` provides Stata code to read and prepare secuTrial exports in Stata. It labels variables, formats date variables and adds labels to categorical variables, among other things, saving each form as a `dta` file for further use.
<details>
<summary>Example</summary>
Assuming certain folders and globals have been prepared in advance (see [GitHub](https://github.com/CTU-Bern/stata_secutrial) for further information), using `stata_secutrial` may be as simple as entering

    do SecuTrial_zip_data_import

into Stata and then navigating to your download when prompted.
</details>
<details>
<summary>Installation</summary>
As `stata_secutrial` is just code rather than a package, you can copy the files from GitHub and use them in your project. Towards the top of the [GitHub page](https://github.com/CTU-Bern/stata_secutrial) is a green `code` button. Click that and choose download ZIP. You can then unzip the files to your working directory.
</details>
## `SwissASR` - simplified annual safety reports with R
![](https://img.shields.io/badge/Language-R-red.svg)
[![](https://img.shields.io/badge/GitHub-silver.svg)](https://github.com/CTU-Bern/SwissASR) [![](https://img.shields.io/badge/Website-blue.svg)](https://ctu-bern.github.io/SwissASR/)
Ethics committees and regulators often require annual safety reports. `SwissASR` provides a relatively easy way to produce annual safety reports according to the current template available on the SwissMedic(?) website. The function returns a Word file with the safety data completed based on the data provided to it. Minimal additional details should then be added by the study team or principal investigator.
<details>
<summary>Example</summary>
</details>
<details>
<summary>Installation</summary>
`SwissASR` can be installed in R via the following method:

    # from CTU Bern's package universe (the development version)
    install.packages("SwissASR", repos = "https://ctu-bern.r-universe.dev/")
</details>