---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
# Virtual patient simulation using copula modeling
These scripts were used to study the use of copulas for the simulation of virtual patient sets.
### Scripts
The `scripts` folder contains the following files:
- **simulation_comparison.R**, contains the data preparation of the pediatric data and the analysis of a three-covariate simulation for five different simulation techniques (Figure 2a)
- **simulation_comparison_12d.R**, uses prepared data from simulation_comparison.R and simulates 12 covariates using the same five simulation techniques (Figure 2b)
- **PK_model_simulation.R**, uses prepared data from simulation_comparison.R. The three-covariate simulations are used to predict vancomycin PK profiles (Figure 3a and 3b)
- **data_preparation_pregnancy.R**, prepares pregnancy data for the time-dependent covariates
- **longitudinal_copulas.R**, uses prepared data from data_preparation_pregnancy.R (Figure 4).
- **data_preparation_MIMIC.R**, extracts columns of interest from the large MIMIC database. *requires large memory*
- **MIMIC_copula.R**, uses prepared data from data_preparation_MIMIC.R; estimates a copula on the MIMIC data and visualizes simulations from it (Figure 5 and S2). *requires large memory*
#### Functions
The scripts above depend on functions defined in separate files.
- **functions.R**, contains a set of helper functions used throughout the project: `create_colors`, `tranform_to_uniform`, `estimate_spline_marginal`, `get_statistics_multiple_sims` and `get_statistics`.
- **estimate_vinecopula.R**, contains a wrapper function around the `rvinecopulib` package for estimation of copulas. It can be used on untransformed data and uses kernel density estimation for the marginal distributions. It creates an object which can be used to simulate new covariate sets.
- **Smania_Jonsson_MICE_simulation.R**, retrieved from the article Smania & Jonsson (2021). Contains a function for conditional distribution simulation.
- **run_Grimsley.R**, contains the `run_grimsley` function which implements the vancomycin PK model from Grimsley & Thomson (1999). Used for PK_model_simulation.R.
- **plot_distributions.R**, plots the contours of simulated and observed data. Used for MIMIC_copula.R (Figure 5 and S2).
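
The estimate-then-simulate workflow described for the copula wrapper can be sketched directly with `rvinecopulib`. This is a minimal illustration, not the wrapper from this repository: it assumes that `vine()` (kernel-density marginals plus a vine copula) and `rvine()` are a reasonable stand-in, and it uses a synthetic data set (`df_obs`, a hypothetical name) instead of the study data.

```r
library(rvinecopulib)

set.seed(1)

# Synthetic, dependent covariates (hypothetical; NOT the study data)
n_obs <- 500
age <- runif(n_obs, min = 0, max = 18)
BW <- exp(log(3.5 + 3 * age) + rnorm(n_obs, sd = 0.15))
df_obs <- data.frame(age = age, BW = BW)

# Fit kernel-density marginals and a vine copula on the untransformed data
fit <- vine(df_obs)

# Simulate a new covariate set on the original scale
df_sim <- as.data.frame(rvine(200, fit))
names(df_sim) <- names(df_obs)

# Compare the dependence (Kendall's tau) in observed and simulated data
tau_obs <- cor(df_obs$age, df_obs$BW, method = "kendall")
tau_sim <- cor(df_sim$age, df_sim$BW, method = "kendall")
```

If the fit captures the dependence structure, `tau_sim` should be close to `tau_obs`, and the simulated marginal distributions should resemble the observed ones.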
### Data
The fitted copulas are included in the repository so that data resembling the study data can be simulated and explored without sharing the underlying data. Use the code below to load the copulas and start simulating.
```{r load_functions, warning = FALSE, message = FALSE}
library(rvinecopulib)
library(tidyverse)
source("scripts/functions/estimate_vinecopula_from_data.R")
source("scripts/functions/functions.R")
source("scripts/functions/plot_distributions.R")
```
```{r set_seed, echo = FALSE}
set.seed(1234)
```
#### Pediatric data
```{r pediatrics, fig.width = 4, fig.height = 3, fig.align = "center"}
load("copulas/pediatric_copula.Rdata")
df_sim_pediatric <- simulate(large_cop, n = 20, value_only = FALSE)
ggplot(data = df_sim_pediatric) +
  geom_point(aes(x = age, y = BW), color = "#3ABAC1") +
  theme_bw()
```
#### Longitudinal data
```{r longitudinal, fig.width = 4, fig.height = 3, fig.align = "center"}
load("copulas/longitudinal_copula.Rdata")
df_sim_longitudinal <- simulate(copula_long, n = 20, value_only = FALSE)
ggplot(data = df_sim_longitudinal$values) +
  geom_line(aes(x = gest, y = Platelets, group = ID), color = "#3ABAC1") +
  theme_bw()
```
#### MIMIC data
The MIMIC data can be retrieved from <https://physionet.org/content/mimiciv/1.0/>.
```{r MIMIC, fig.width = 4, fig.height = 3, fig.align = "center"}
load("copulas/mimic_copula.Rdata")
df_sim_mimic <- simulate(cop_mimic, n = 20, value_only = FALSE)
ggplot(data = df_sim_mimic) +
  geom_point(aes(x = HDL, y = Triglyceride), color = "#3ABAC1") +
  theme_bw()
```