---
title: "QCing tables with ARDs"
editor: source
format:
  html:
    page-layout: full
    code-fold: true
    code-summary: "Show the code"
    code-overflow: scroll
    df-print: kable
---
### QCing rtables
You are likely familiar with the R packages developed by NEST for [generating tables and graphs](https://insightsengineering.github.io/tlg-catalog/stable/).
A common practice for quality control (QC) of the calculated statistics is double programming: running the *same data* through a *different program* and checking that the results align.
Historically, these tables have been compared to outputs produced by SAS statistical software.
Here, we offer an alternative using Analysis Results Datasets (ARDs). The CDISC Analysis Results Standard aims to facilitate automation, reproducibility, reusability, and traceability of analysis results data (ARD).
The {cards} and {cardx} packages can be used to create these analysis datasets.
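To make the ARD structure concrete, the sketch below builds a small ARD from the `ADSL` example data bundled with {cards}. It assumes an installed version of {cards} in which `ard_continuous()` is available, as used elsewhere on this page; `age_ard` is a name chosen only for illustration. Each row of the result holds one statistic for one variable within one group.

```{r}
library(cards)

# One row per ARM level and statistic for AGE
# (ADSL is example data shipped with {cards}; age_ard is an illustrative name)
age_ard <- ard_continuous(ADSL, by = "ARM", variables = "AGE")
age_ard
```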
### Recommended QCing workflow
Below we provide an example workflow for QCing tables.
#### Generate a table using {chevron}
```{r}
#| message: false
#| code-summary: "Show the code"
library(chevron)
# Create a table using the chevron package
dmt01 <- chevron::run(dmt01, syn_data)
dmt01
```
#### Flatten the table into a data.frame
An rtables-based output can be flattened into a data frame using the `as_result_df()` function from the rtables package. Setting the `make_ard` argument to `TRUE` formats the data similarly to the output generated by the {cards} package.
```{r}
results <- rtables::as_result_df(dmt01, make_ard = TRUE, expand_colnames = TRUE)
results[1:6, -c(1:3)]
```
#### Create a comparable ARD
Using the {cards} package, we call `ard_stack()` to combine `ard_continuous()` for the continuous variables and `ard_categorical()` for the categorical variables.
Each function calculates a default set of statistics for its data type; these can be adapted for bespoke analyses.
```{r}
library(cards)
# Build ARDs that calculate the relevant statistics for continuous and categorical variables
ards <- ard_stack(
  syn_data$adsl,
  ard_continuous(
    variables = c(AGE),
    statistic = ~ continuous_summary_fns(c("N", "mean", "sd", "median", "min", "max"))
  ),
  ard_categorical(variables = c(AGEGR1, SEX, ETHNIC, RACE)),
  .by = "ARM",
  .overall = TRUE
)
ards[1:6, -c(1, 9:11)]
```
#### Visualize statistics comparison
Because both data frames contain similar key variables (`group1_level`, `variable`, `variable_level`, `stat_name`, etc.), the statistics can be compared side by side by joining the tables on those keys.
```{r}
# Rework the rtables output to match the ARD ----
# Rename group2 to group1 and recode stat_name ("n" -> "N", "count" -> "n")
reformat <- results |>
  dplyr::select(-c(1:2, 7)) |>
  dplyr::rename("group1" = group2, "group1_level" = group2_level, "stat_rtables" = stat) |>
  dplyr::mutate(
    variable_level = sub("^[^.]*\\.", "", variable_level), # drop the prefix up to the first "."
    stat_name = dplyr::recode(stat_name, "n" = "N", "count" = "n"),
    variable = dplyr::recode(variable, "AAGE" = "AGE"),
    variable_level = dplyr::recode(variable_level, "Male" = "M", "Female" = "F"),
    variable_level = dplyr::case_when(
      variable_level %in% c("mean_sd", "median", "range", "n") ~ NA_character_,
      TRUE ~ variable_level
    )
  )
# Unlist the ARD list columns and label the overall (ungrouped) rows
ards2 <- ards |>
  dplyr::mutate(
    group1_level = purrr::map_chr(group1_level, ~ ifelse(length(.x) > 0, as.character(.x[[1]]), NA_character_)),
    variable_level = purrr::map_chr(variable_level, ~ ifelse(length(.x) > 0, as.character(.x[[1]]), NA_character_)),
    group1 = dplyr::coalesce(group1, "ARM"),
    group1_level = dplyr::coalesce(group1_level, "All Patients")
  ) |>
  dplyr::select(-c(9:11))
# Left join: keep every ARD row and attach the matching rtables statistic
compare <- dplyr::left_join(ards2, reformat, by = c("group1_level", "group1", "variable", "variable_level", "stat_name"))
compare[1:20, ]
```
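Because a left join keeps every ARD row, statistics with no counterpart in the rtables output appear with `NA` in `stat_rtables`. One way to list such rows explicitly is an anti join on the same keys. The sketch below uses made-up toy data rather than the tables above; `ard_side`, `rtables_side`, and `unmatched` are hypothetical names.

```{r}
# Toy stand-ins for the two long-format tables (values are made up for illustration)
ard_side <- data.frame(
  variable  = c("AGE", "AGE", "SEX"),
  stat_name = c("mean", "sd", "n"),
  stat      = c(34.5, 6.2, 120)
)
rtables_side <- data.frame(
  variable     = c("AGE", "SEX"),
  stat_name    = c("mean", "n"),
  stat_rtables = c(34.5, 120)
)

# Rows of the ARD side that have no match on the join keys
unmatched <- dplyr::anti_join(ard_side, rtables_side, by = c("variable", "stat_name"))
unmatched
```

Reviewing the unmatched rows first helps separate genuine differences in statistics from differences in naming or labelling between the two sources.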
#### Compare programmatically
With all statistics aligned, functions such as `all.equal()` or `identical()` can be used to verify that the statistics produced are equivalent.
```{r}
# Drop rows without a match, then flag whether the two statistics agree
na.omit(compare) |>
  dplyr::mutate(columns_match = stat == stat_rtables)
```
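Exact comparison with `==` or `identical()` can flag differences that are only floating-point noise, whereas `all.equal()` compares within a numeric tolerance. A minimal base R sketch with made-up values (`stat_a` and `stat_b` are hypothetical and not taken from the tables above):

```{r}
# Hypothetical statistics from two sources; the first element of stat_b
# carries floating-point noise from the addition
stat_a <- c(0.3, 34.5, 120)
stat_b <- c(0.1 + 0.2, 34.5, 120)

stat_a == stat_b                   # element-wise exact comparison
identical(stat_a, stat_b)          # exact comparison of the whole vector
isTRUE(all.equal(stat_a, stat_b))  # comparison within a numeric tolerance
```

Wrapping `all.equal()` in `isTRUE()` matters because it returns a description of the differences, not `FALSE`, when the inputs do not match.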