-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathAdjustedSurvivalCurves.qmd
214 lines (166 loc) · 6.46 KB
/
AdjustedSurvivalCurves.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
---
title: "Adjusted survival curves"
bibliography: references.bib
# format: pdf
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,
warning = FALSE)
```
:::{.callout-warning}
This page is under development. Changes should be expected.
:::
# Background
Unadjusted Kaplan-Meier curves from observational data might be biased because of confounding (@COLE200445). This vignette introduces the concept of inverse probability (IP) weighted survival curves.
# Example
We consider a study population of 500 patients who received a medication. 60% of the patients were women. In men, a specific enzyme was measured with a probability of 30%, whereas in women have a 5 times higher chance of having the enzyme. Those with the enzyme were more likely to receive the medication (with a probability of 75%), compared to those without the enzyme (probability of 50%).
- Research question: What is the effect of the medication on death?
- Study design: Cohort study
- Outcome of interest: Time to death or end of follow-up
- Predictor of interest: Medication
- Confounders: Sex, enzyme
The used variables from the data are:
| Variable | Definition | Coding |
| --- | --- | --- |
| female | Sex | 1=Women, 0=Men |
| enzyme | Measured enzyme | 1=Present, 0=Not present |
| medi | Medication | 1=Received, 0=Not received |
| death | Death | 1=Death, 0=Alive |
| fup | Follow-up time | Non-negative number |
## Knowledge from the crystal ball
Because we simulated the data, we know that medication has no effect on mortality.
# Analysis strategy
IP weighting constructs weights which are equal to the probability of an individual's treatment (here: receiving the medication) given observed covariates (here: sex and enzyme) and creates pseudopopulations in which observed covariates are independent of treatment (i.e. no confounding) (@hernan_book).
```{r}
#| echo: true
#| message: false
#| warning: false
# Required packages
library(tidyverse)
library(survey)
library(survival)
library(gtsummary)
library(survminer)
```
```{r}
#| echo: false
#| message: false
#| warning: false
#| results: asis
#| label: sim
set.seed(1)
n <- 500
# 60% percent women
female <- rbinom(n, 1, 0.6)
# Men has the enzyme with a prob of 0.3; women have
# a 5 times higher chance of having the enzym
enzyme <- ifelse(runif(n) < plogis(qlogis(0.3) + log(5) * female), 1, 0)
# Those with enzyme receive medication with prob 0.75
medi <- ifelse(runif(n) < plogis(log(3) * enzyme), 1, 0)
# Hazard of dying: HR=10 of those with enzyme; No effect for medication
haz <- 0.01 * exp(log(10) * enzyme)
# Time to death
time_to_death <- -log(runif(n)) / haz
# Censoring time: Mean duration 20 days
censored <- rweibull(n, 1, 20)
# Follow-up time
fup <- pmin(time_to_death, censored)
# Death
death <- ifelse(time_to_death <= censored, 1, 0)
# Data
data <- data.frame(fup, death, medi, enzyme, female)
```
## Descriptive table
The table below shows a descriptive summary of the study population, by medication.
```{r}
#| echo: true
#| message: false
#| warning: false
#| results: asis
data %>%
tbl_summary(by = medi)
```
## Unadjusted Kaplan-Meier curve and Cox-modelling
A naive (unadjusted) survival analysis of the data reveals the following Kaplan-Meier plot. We conclude that the medication has an effect on survival.
```{r}
#| echo: true
#| message: false
#| warning: false
mod <- survfit(Surv(fup, death) ~ medi, data = data)
ggsurvplot(mod, data = data, palette = c("#CC0000", "black"), censor = FALSE)
```
```{r}
#| echo: true
#| message: false
#| warning: false
mod_cox_unadjusted <- coxph(Surv(fup, death) ~ medi, data = data)
mod_cox_unadjusted
```
An unadjusted Cox proportional hazard model shows that patients with medication have `r round(exp(mod_cox_unadjusted$coef[1]), 1)` higher hazard of death compared to those without medication.
```{r}
#| echo: true
#| message: false
#| warning: false
mod_cox_adjusted <- coxph(Surv(fup, death) ~ medi + enzyme + female, data = data)
mod_cox_adjusted
```
What happens if we adjust for enzym and sex? Then the effect of the medication on death vanishes (hazard ratio=`r round(exp(mod_cox_adjusted$coef[1]), 2)`).
## IPW modelling
An IPW modelling approach constructs treatment weights (here medication) given known covariates (here sex and enzyme) using a logistic regression model.
```{r}
#| echo: true
#| message: false
#| warning: false
#| results: markup
# IPW denominator
mod <- glm(medi ~ female + enzyme, data = data, family = binomial())
data$ipw <- NA
# Probabilty of treatment
data$ipw <- predict(mod, data = data, type = "response")
# Probabilty of non-treatment
data$ipw[data$medi==0] <- 1 - predict(mod, data = data, type = "response")[data$medi == 0]
```
We construct stabilized weights, since they can provide narrower confidence intervals (@hernan_book).
```{r}
#| echo: true
#| message: false
#| warning: false
#| results: markup
# Stabilized weights
mod0 <- glm(medi ~ 1, data = data, family = binomial())
data$ipw0 <- predict(mod0, data = data, type = "response")
data$ipw0[data$medi == 0] <- 1 - predict(mod0, data = data, type = "response")[data$medi == 0]
data$ipw <- data$ipw0 / data$ipw
```
An IPW adjusted Kaplan-Meier curve reveals that medication has no effect on survival:
```{r}
#| echo: true
#| message: false
#| warning: false
#| results: markup
# Set survey design
svy_design <- svydesign(id = ~1, weights = ~ipw, data = data)
# IPW adjusted Kaplan-Meier
km_fit <- svykm(Surv(fup, death) ~ medi, design = svy_design)
km_df <- data.frame(time = km_fit$`1`$time, surv = km_fit$`1`$surv, strata = "medi=1")
km_df <- bind_rows(km_df, data.frame(time=km_fit$`0`$time, surv=km_fit$`0`$surv, strata = "medi=0"))
ggsurvplot_df(km_df, palette = c("#CC0000", "black"), censor=FALSE)
```
```{r}
#| echo: true
#| message: false
#| warning: false
#| results: markup
mod_cox_ipw_adjusted <- svycoxph(Surv(fup, death) ~ medi, design = svy_design)
summary(mod_cox_ipw_adjusted)
```
This is confirmed by an IPW adjusted Cox regression model (hazard ratio=`r round(exp(mod_cox_ipw_adjusted$coef[1]), 2)`).
# Conclusion
Unadjusted Kaplan-Meier curves from observational data might be biased because of confounding. IPW adjusted survival curves account for confounding by constructing weights which are proportional to the probability of treatment given known covariates. An advantage of IPW adjusted Kaplan-Meier curves is that they provide marginal survival estimates, in contrast to stratified plots (@COLE200445).
# Data simulation
```{r}
#| label: sim
#| echo: true
#| eval: false
```
# References