diff --git a/docs/404.html b/docs/404.html deleted file mode 100644 index baaee1b..0000000 --- a/docs/404.html +++ /dev/null @@ -1,148 +0,0 @@ - - - -
- - - - -session_info()
-# please paste here the result of
-devtools::session_info()
-vignettes/Informative_Survival_Plots.Rmd
- Informative_Survival_Plots.Rmd
--This vignette covers changes between versions 0.1.0 and 0.2.0.
-
The release of Hadley Wickham’s ggplot2 version 2.0 at the end of 2015 broke many dependent R packages, and a few of them were eventually removed from The Comprehensive R Archive Network. In particular, the survMisc
package was removed from CRAN on 27 January 2016, leaving the R world without a tool for elegant visualization of survival analysis. Then a new tool, the survminer package created by Alboukadel Kassambara, appeared on the R survival scene to fill the gap by visualizing Kaplan-Meier estimates of survival curves in an elegant, grammar-of-graphics-like way. This blog presents the main features of the core ggsurvplot()
function from the survminer package, which creates the most informative, elegant and flexible survival plots that I have seen!
During the development of the RTCGA package (about which I wrote here) we needed to provide biotechnologists with the simplest possible interface to estimates of survival curves, and the discovery of ggsurvplot()
was a bull’s-eye! Many have tried to provide a package or function for ggplot2-like plots that would present the basic tool of survival analysis, Kaplan-Meier estimates of survival curves, but none of the earlier attempts provided such a rich set of features and such flexibility as survminer. From estimates of survival curves one can infer differences in survival times between compared groups, so survival plots are very useful and appear in almost every publication in the field of survival analysis and time-to-event models.
After regular installation
-install.packages('survminer')
-BiocManager::install("RTCGA.clinical") # data for examples
we can create simple estimates of survival curves just after we put survfit
(survival package) object into ggsurvplot()
function. Let’s have a look at differences in survival times between patients suffering from Ovarian Cancer (Ovarian serous cystadenocarcinoma) and patients suffering from Breast Cancer (Breast invasive carcinoma), where data were collected by The Cancer Genome Atlas Project.
library(survminer)
-library(RTCGA.clinical)
-survivalTCGA(BRCA.clinical, OV.clinical,
-             extract.cols = "admin.disease_code") -> BRCAOV.survInfo
-library(survival)
-fit <- survfit(Surv(times, patient.vital_status) ~ admin.disease_code,
-               data = BRCAOV.survInfo)
-# Visualize with survminer
-ggsurvplot(fit, data = BRCAOV.survInfo, risk.table = TRUE)
This simple plot presents, in an elegant way, estimates of survival probability as a function of days since cancer diagnosis, grouped by cancer type, together with an informative risk set table that shows the number of patients still under observation at each time point. Survival analysis is a specific field of data analysis because time-to-event data are censored, so the risk set size is a must for visual inference.
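To make the risk set table concrete, here is a minimal base R sketch of how the number at risk can be counted at each axis break. The data frame `toy` and the helper `n_at_risk()` are hypothetical names invented for this illustration; this is not how survminer computes its table internally.

```r
# Toy follow-up data (hypothetical): observed time in days and event indicator
# (1 = death observed, 0 = censored).
toy <- data.frame(
  time   = c(120, 340, 410, 560, 800, 950, 1300, 1750),
  status = c(1,   0,   1,   1,   0,   1,   0,    1)
)

# Number at risk at time t: subjects whose observed time is >= t,
# i.e. who have neither had the event nor been censored before t.
n_at_risk <- function(t, time) {
  vapply(t, function(ti) sum(time >= ti), integer(1))
}

# One row per axis break, as in a risk table drawn under the curves.
breaks <- seq(0, 2000, by = 500)
risk_table <- data.frame(time = breaks, n.risk = n_at_risk(breaks, toy$time))
print(risk_table)
```

Censored subjects leave the risk set just like subjects with an event, which is why the table is needed to judge how much data supports each part of the curve.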
-After a few improvements (#1, #2, #3, #4, #7, #8, #12, #28) made by Alboukadel in version 0.2.0, one can create a powerful, informative survival plot with the following parameters:
-ggsurvplot(
-   fit,                       # survfit object with calculated statistics.
-   data = BRCAOV.survInfo,    # data used to fit survival curves.
-   risk.table = TRUE,         # show risk table.
-   pval = TRUE,               # show p-value of log-rank test.
-   conf.int = TRUE,           # show confidence intervals for
-                              # point estimates of survival curves.
-   xlim = c(0,2000),          # present narrower X axis, but not affect
-                              # survival estimates.
-   break.time.by = 500,       # break X axis in time intervals by 500.
-   ggtheme = theme_minimal(), # customize plot and risk table with a theme.
-   risk.table.y.text.col = TRUE, # colour risk table text annotations.
-   risk.table.y.text = FALSE  # show bars instead of names in text annotations
-                              # in legend of risk table
-)
Each parameter is described in the corresponding comment, but I would like to emphasize the xlim
parameter, which controls the limits of the X axis but does not affect the survival curves: they still take all observed times into account. Another brilliant parameter is break.time.by
, which affects both the survival plot and the risk set table; this would not be easy to implement yourself. The ggtheme
parameter is also handy for simple plot customization. Finally, risk.table.y.text
and risk.table.y.text.col
(both of which I requested as a user) are useful parameters that replace the text in the risk table legend (often too long and redundant) with a narrow, coloured bar. This saves a lot of space in the final plot.
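The difference between narrowing the X axis and narrowing the data is easy to miss, so here is a small base R sketch of it. The `km()` function is a hypothetical, minimal Kaplan-Meier estimator written only for this illustration, and the times are made-up toy values; it is not the code that survival or survminer run.

```r
# Minimal Kaplan-Meier estimator (illustrative sketch only).
km <- function(time, status) {
  t_ev <- sort(unique(time[status == 1]))                    # distinct event times
  n    <- sapply(t_ev, function(t) sum(time >= t))           # at risk just before t
  d    <- sapply(t_ev, function(t) sum(time == t & status == 1))  # events at t
  data.frame(time = t_ev, surv = cumprod(1 - d / n))
}

time   <- c(100, 300, 450, 700, 900, 1500, 2200, 3000)
status <- c(1,   1,   0,   1,   1,   0,    1,    0)

full <- km(time, status)

# What a restricted X axis does: estimate on all data, then only display t <= 2000.
zoomed <- full[full$time <= 2000, ]

# What it does not do: drop subjects followed for longer than 2000 days and refit.
keep  <- time <= 2000
refit <- km(time[keep], status[keep])

print(zoomed)
print(refit)
```

The zoomed curve is simply the first part of the full curve, whereas refitting on the truncated data changes the risk sets and therefore the estimates themselves.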
Even though survMisc
has returned to CRAN, I appreciate survminer so much that I no longer look for other solutions. See why: at the end I present survival curves that can be obtained with the base
package and the survMisc
package.
plot(fit) # base
It looks pretty… base…
-plot(fit, col=c("orange","purple"), lty=c(1:2), lwd=3, # base with some customization
-     conf.int = TRUE, xmax = 2000)
-# add a legend
-legend(100, .2, c("Ovarian Cancer", "Breast Cancer"),
-       lty = c(1:2), col=c("orange","purple"))
I haven’t seen examples with a risk table, and adding a legend isn’t as quick as in survminer. Moreover, there are no minor axis break lines.
-# install.packages('survMisc')
-library(survMisc)
-survMisc:::autoplot.survfit(fit) # no customization
Why are colours not assigned to any group? Where is the legend? Why is there so much white space to the right? (Those were questions for the version of survMisc where this was an issue.) Where is the OX axis?
-survMisc:::autoplot.survfit( # with some hard customization
-   fit,
-   type = "fill",
-   pVal=TRUE
-) -> fit.survMisc
-fit.survMisc$table <- fit.survMisc$table +
-   theme_minimal() + # theme(legend.position = "top")
-   coord_cartesian(xlim = c(0,2000))
-fit.survMisc$plot <- fit.survMisc$plot +
-   theme_minimal() +
-   coord_cartesian(xlim = c(0,2000))
-survMisc:::print.tableAndPlot(fit.survMisc)
Where is the risk table? Why can’t I pass break.time.by
to get informative minor X axis breaks? Why does the plot get narrower when the legend in the risk table gets wider, and why can’t I do anything to work around this?
--Never mind ->
-install.packages('survminer')
vignettes/Playing_with_fonts_and_texts.Rmd
- Playing_with_fonts_and_texts.Rmd
library("survminer")
--This vignette covers changes between versions 0.2.4 and 0.2.5 for additional texts and fonts customization enabled for subtitles and captions.
-
Compare the basic plot
-library("survival")
-fit <- survfit(Surv(time, status) ~ sex, data = lung)
-
-# Drawing survival curves
-ggsurvplot(fit, data = lung)
with the plot where every possible text on a plot
is specified
ggsurvplot(fit, data = lung,
-   title = "Survival curves", subtitle = "Based on Kaplan-Meier estimates",
-   caption = "created with survminer",
-   font.title = c(16, "bold", "darkblue"),
-   font.subtitle = c(15, "bold.italic", "purple"),
-   font.caption = c(14, "plain", "orange"),
-   font.x = c(14, "bold.italic", "red"),
-   font.y = c(14, "bold.italic", "darkred"),
-   font.tickslab = c(12, "plain", "darkgreen"))
Now allow risk.table
to be displayed.
Please compare basic plot with a risk table
-ggsurvplot(fit, data = lung, risk.table = TRUE)
with the plot where every possible text on a plot
and table
is specified
ggsurvplot(fit, data = lung,
-   title = "Survival curves", subtitle = "Based on Kaplan-Meier estimates",
-   caption = "created with survminer",
-   font.title = c(16, "bold", "darkblue"),
-   font.subtitle = c(15, "bold.italic", "purple"),
-   font.caption = c(14, "plain", "orange"),
-   font.x = c(14, "bold.italic", "red"),
-   font.y = c(14, "bold.italic", "darkred"),
-   font.tickslab = c(12, "plain", "darkgreen"),
-   ########## risk table #########
-   risk.table = TRUE,
-   risk.table.title = "Note the risk set sizes",
-   risk.table.subtitle = "and remember about censoring.",
-   risk.table.caption = "source code: website.com",
-   risk.table.height = 0.45)
Finally, allow ncensor.plot
to be displayed.
Please compare the basic plot with a risk table and an ncens
plot
ggsurvplot(fit, data = lung, risk.table = TRUE, ncensor.plot = TRUE)
with the full customization
-ggsurvplot(fit, data = lung,
-   title = "Survival curves", subtitle = "Based on Kaplan-Meier estimates",
-   caption = "created with survminer",
-   font.title = c(16, "bold", "darkblue"),
-   font.subtitle = c(15, "bold.italic", "purple"),
-   font.caption = c(14, "plain", "orange"),
-   font.x = c(14, "bold.italic", "red"),
-   font.y = c(14, "bold.italic", "darkred"),
-   font.tickslab = c(12, "plain", "darkgreen"),
-   ########## risk table #########
-   risk.table = TRUE,
-   risk.table.title = "Note the risk set sizes",
-   risk.table.subtitle = "and remember about censoring.",
-   risk.table.caption = "source code: website.com",
-   risk.table.height = 0.35,
-   ncensor.plot = TRUE,
-   ncensor.plot.title = "Number of censorings",
-   ncensor.plot.subtitle = "over the time.",
-   ncensor.plot.caption = "data available at data.com",
-   ncensor.plot.height = 0.35)
title and subtitle for the curve_plot, risk.table.title and risk.table.subtitle for the table, and ncensor.plot.title and ncensor.plot.subtitle for the ncens.
curve_plot, for the table and for the ncens parts. This might change in the future.
vignettes/Specifiying_weights_in_log-rank_comparisons.Rmd
- Specifiying_weights_in_log-rank_comparisons.Rmd
library("survminer")
--This vignette covers changes between versions 0.2.4 and 0.2.5 for specifying weights in the log-rank comparisons done in
-ggsurvplot()
.
As stated in the literature, the Log-rank test for comparing survival (estimates of survival curves) in 2 groups (\(A\) and \(B\)) is based on the statistic
-\[LR = \frac{U^2}{V} \sim \chi^2(1),\]
-where \[U = \sum_{i=1}^{T}w_{t_i}(o_{t_i}^A-e_{t_i}^A), \ \ \ \ \ \ \ \ V = Var(U) = \sum_{i=1}^{T}\left(w_{t_i}^2\frac{n_{t_i}^An_{t_i}^Bo_{t_i}(n_{t_i}-o_{t_i})}{n_{t_i}^2(n_{t_i}-1)}\right)\] and \(t_1, \dots, t_T\) are the distinct event times, \(n_{t_i}\) is the size of the risk set at \(t_i\), \(o_{t_i}\) and \(e_{t_i}\) are the observed and expected numbers of events at \(t_i\), \(w_{t_i}\) is the weight at \(t_i\), and the superscripts \(A\) and \(B\) restrict a quantity to one group.
-Also note that
-\[e_{t_i}^A = n_{t_i}^A \frac{o_{t_i}}{n_{t_i}}, \ \ \ \ \ \ \ \ \ \ e_{t_i}^B = n_{t_i}^B \frac{o_{t_i}}{n_{t_i}},\] \[e_{t_i}^A + e_{t_i}^B = o_{t_i}^A + o_{t_i}^B,\]
-which is why we can swap group \(A\) for \(B\) in \(U\) and obtain the same result.
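The statistic can be sketched in a few lines of base R. This is only an illustration on hypothetical toy data, not the code that survminer or survMisc run; the function name `weighted_logrank()` and its arguments `ref` and `weight` are invented for this example.

```r
# Toy two-group data (hypothetical values): time, event indicator, group label.
d <- data.frame(
  time   = c(5, 8, 12, 12, 20, 23, 30, 34, 41, 50),
  status = c(1, 1, 1,  0,  1,  1,  0,  1,  1,  0),
  group  = c("A", "B", "A", "B", "A", "B", "A", "B", "B", "A")
)

# Weighted log-rank statistic for two groups.
# `ref` is the group used in U; `weight` maps the risk set sizes n to weights w.
weighted_logrank <- function(d, ref = "A",
                             weight = function(n) rep(1, length(n))) {
  ev  <- sort(unique(d$time[d$status == 1]))                    # distinct event times
  n   <- sapply(ev, function(t) sum(d$time >= t))               # at risk, both groups
  n_r <- sapply(ev, function(t) sum(d$time >= t & d$group == ref))
  o   <- sapply(ev, function(t) sum(d$time == t & d$status == 1))
  o_r <- sapply(ev, function(t) sum(d$time == t & d$status == 1 & d$group == ref))
  e_r <- n_r * o / n                                            # expected events in ref
  w   <- weight(n)
  U   <- sum(w * (o_r - e_r))
  # Variance terms; a risk set of size 1 contributes nothing.
  v   <- ifelse(n > 1, n_r * (n - n_r) * o * (n - o) / (n^2 * (n - 1)), 0)
  V   <- sum(w^2 * v)
  LR  <- U^2 / V
  c(U = U, V = V, LR = LR, p = pchisq(LR, df = 1, lower.tail = FALSE))
}

lr_A  <- weighted_logrank(d, ref = "A")               # regular log-rank, w = 1
lr_B  <- weighted_logrank(d, ref = "B")               # same test seen from group B
gehan <- weighted_logrank(d, weight = function(n) n)  # w = n, generalized Wilcoxon
print(rbind(lr_A, lr_B, gehan))
```

Swapping the reference group flips the sign of \(U\) but leaves \(V\), and hence \(LR\), unchanged, which is the symmetry stated above.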
-The regular Log-rank comparison uses \(w_{t_i} = 1\), but many modifications of that approach have been proposed. The most popular modifications, called weighted Log-rank tests, are available in ?survMisc::comp
n: Gehan and Breslow proposed to use \(w_{t_i} = n_{t_i}\) (this is also called generalized Wilcoxon),
sqrtN: Tarone and Ware proposed to use \(w_{t_i} = \sqrt{n_{t_i}}\),
S1: Peto-Peto’s modified survival estimate \(w_{t_i} = S1({t_i}) = \prod_{j:\, t_j \le t_i}(1-\frac{o_{t_j}}{n_{t_j}+1})\),
S2: modified Peto-Peto (by Andersen) \(w_{t_i} = S2({t_i}) = \frac{S1({t_i})n_{t_i}}{n_{t_i}+1}\),
FH: Fleming-Harrington \(w_{t_i} = S(t_i)^p(1 - S(t_i))^q\).
--Watch out for
-FH
as I reported on the survMisc repository that I find their mathematical notation for Fleming-Harrington misleading.
The regular Log-rank test is sensitive to differences in late survival times, whereas the Gehan-Breslow and Tarone-Ware proposals might be used if one is interested in early differences in survival times. The Peto-Peto modifications are also useful for early differences and are more robust (than Tarone-Ware or Gehan-Breslow) when many observations are censored. The most flexible is the Fleming-Harrington method for weights, where high p
emphasizes early differences and high q
emphasizes differences in late survival times. But there is always the question of how to choose p
and q
.
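To see how p and q move the emphasis, the Fleming-Harrington weights can be evaluated on a decreasing survival curve. The values of `S` below are hypothetical, and `fh_weight()` is a helper invented for this sketch that applies the formula above directly; software implementations may evaluate S just before each event time.

```r
# Hypothetical pooled survival estimates S(t_i) at increasing event times.
S <- c(0.95, 0.85, 0.70, 0.50, 0.30, 0.15)

# Fleming-Harrington weight: S^p * (1 - S)^q.
fh_weight <- function(S, p, q) S^p * (1 - S)^q

early  <- fh_weight(S, p = 1, q = 0)  # largest weights at early times
late   <- fh_weight(S, p = 0, q = 1)  # largest weights at late times
middle <- fh_weight(S, p = 1, q = 1)  # peaks where S is close to 0.5
flat   <- fh_weight(S, p = 0, q = 0)  # all ones: the regular log-rank test

print(round(rbind(early, late, middle, flat), 3))
```

With p = q = 0 every weight equals 1, so the regular Log-rank test is the special case in which no part of the follow-up is favoured.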
--Remember that test selection should be performed at the research design level! Not after looking in the dataset.
-
After implementing the functionality requested in the GitHub issues Other tests than log-rank for testing survival curves and Log-rank test for trend, we are now able to compute p-values for various Log-rank tests in the survminer package. Let us look at examples below that run all the available tests.
-ggsurvplot(fit, data = lung, pval = TRUE, pval.method = TRUE)
ggsurvplot(fit, data = lung, pval = TRUE, pval.method = TRUE,
-           log.rank.weights = "1")
ggsurvplot(fit, data = lung, pval = TRUE, pval.method = TRUE,
-           log.rank.weights = "n", pval.method.coord = c(5, 0.1),
-           pval.method.size = 3)
ggsurvplot(fit, data = lung, pval = TRUE, pval.method = TRUE,
-           log.rank.weights = "sqrtN", pval.method.coord = c(3, 0.1),
-           pval.method.size = 4)
ggsurvplot(fit, data = lung, pval = TRUE, pval.method = TRUE,
-           log.rank.weights = "S1", pval.method.coord = c(5, 0.1),
-           pval.method.size = 3)
ggsurvplot(fit, data = lung, pval = TRUE, pval.method = TRUE,
-           log.rank.weights = "S2", pval.method.coord = c(5, 0.1),
-           pval.method.size = 3)
ggsurvplot(fit, data = lung, pval = TRUE, pval.method = TRUE,
-           log.rank.weights = "FH_p=1_q=1",
-           pval.method.coord = c(5, 0.1),
-           pval.method.size = 4)
Gehan A 1965 A Generalized Wilcoxon Test for Comparing Arbitrarily Singly-Censored Samples. Biometrika 52(1/2):203-23.
Tarone RE, Ware J 1977 On Distribution-Free Tests for Equality of Survival Distributions. Biometrika 64(1):156-60.
Peto R, Peto J 1972 Asymptotically Efficient Rank Invariant Test Procedures. J Royal Statistical Society 135(2):186-207.
Fleming TR, Harrington DP, O’Sullivan M 1987 Supremum Versions of the Log-Rank and Generalized Wilcoxon Statistics. J American Statistical Association 82(397):312-20.
Billingsley P 1999 Convergence of Probability Measures. New York: John Wiley & Sons.
vignettes/ggforest-show-interactions-hazard-ratio.Rmd
- ggforest-show-interactions-hazard-ratio.Rmd
In the general case it may be tricky to automatically extract interactions or variable transformations from model objects. One suggestion is to manually create new variables that capture the desired interaction effects and add them to the model explicitly. This article describes an example of how to do this.
-res.cox <- coxph(Surv(time, status) ~ ph.karno * age, data=lung)
-summary(res.cox, conf.int = FALSE)
Call:
-coxph(formula = Surv(time, status) ~ ph.karno * age, data = lung)
-
- n= 227, number of events= 164
- (1 observation deleted due to missingness)
-
- coef exp(coef) se(coef) z Pr(>|z|)
-ph.karno -0.1211782 0.8858761 0.0486092 -2.493 0.0127 *
-age -0.1206758 0.8863212 0.0610426 -1.977 0.0481 *
-ph.karno:age 0.0016586 1.0016600 0.0007525 2.204 0.0275 *
----
-Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-
-Concordance= 0.598 (se = 0.025 )
-Likelihood ratio test= 14.52 on 3 df, p=0.002
-Wald test = 13.42 on 3 df, p=0.004
-Score (logrank) test = 13.44 on 3 df, p=0.004
-Visualization of the hazard ratios using the function ggforest()
.
ggforest(res.cox, data = lung)
On the plot above, it can be seen that ggforest()
ignores the interaction term ph.karno:age
.
To fix this, a solution is to create manually the variable that handles the interaction:
-lung$ph.karno_age <- lung$ph.karno * lung$age
and now you can fit an additive model and the ggforest()
function will include it in the plot:
res.cox2 <- coxph(Surv(time, status) ~ ph.karno + age + ph.karno_age, data = lung)
-summary(res.cox2, conf.int = FALSE)
Call:
-coxph(formula = Surv(time, status) ~ ph.karno + age + ph.karno_age,
- data = lung)
-
- n= 227, number of events= 164
- (1 observation deleted due to missingness)
-
- coef exp(coef) se(coef) z Pr(>|z|)
-ph.karno -0.1211782 0.8858761 0.0486092 -2.493 0.0127 *
-age -0.1206758 0.8863212 0.0610426 -1.977 0.0481 *
-ph.karno_age 0.0016586 1.0016600 0.0007525 2.204 0.0275 *
----
-Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-
-Concordance= 0.598 (se = 0.025 )
-Likelihood ratio test= 14.52 on 3 df, p=0.002
-Wald test = 13.42 on 3 df, p=0.004
-Score (logrank) test = 13.44 on 3 df, p=0.004
-ggforest(res.cox2, data=lung)
The survminer R package provides functions for facilitating survival analysis and visualization.
-The main functions in the package are organized in different categories as follows.
-Survival Curves
-ggsurvplot(): Draws survival curves with the ‘number at risk’ table, the cumulative number of events table and the cumulative number of censored subjects table.
arrange_ggsurvplots(): Arranges multiple ggsurvplots on the same page.
ggsurvevents(): Plots the distribution of event’s times.
surv_summary(): Summary of a survival curve. Compared to the default summary() function, surv_summary() creates a data frame containing a nice summary from survfit results.
surv_cutpoint(): Determines the optimal cutpoint for one or multiple continuous variables at once. Provides the cutpoint value that corresponds to the most significant relation with survival.
pairwise_survdiff(): Multiple comparisons of survival curves. Calculate pairwise comparisons between group levels with corrections for multiple testing.
ggcoxzph(): Graphical test of proportional hazards. Displays a graph of the scaled Schoenfeld residuals, along with a smooth curve using ggplot2. Wrapper around plot.cox.zph().
ggcoxdiagnostics(): Displays diagnostics graphs presenting goodness of Cox Proportional Hazards Model fit.
ggcoxfunctional(): Displays graphs of continuous explanatory variable against martingale residuals of null cox proportional hazards model. It helps to properly choose the functional form of continuous variable in cox model.
ggforest(): Draws forest plot for CoxPH model.
ggcoxadjustedcurves(): Plots adjusted survival curves for coxph model.
--Find out more at https://rpkgs.datanovia.com/survminer/, and check out the documentation and usage examples of each of the functions in survminer package.
-
Install from CRAN as follows:
-install.packages("survminer")
Or, install the latest version from GitHub:
-if(!require(devtools)) install.packages("devtools")
-devtools::install_github("kassambara/survminer", build_vignettes = FALSE)
Load survminer:
-library("survminer")
ggsurvplot(fit, data = lung)
Censor shape can be changed as follows:
-ggsurvplot(fit, data = lung, censor.shape="|", censor.size = 4)
ggsurvplot( - fit, - data = lung, - size = 1, # change line size - palette = - c("#E7B800", "#2E9FDF"),# custom color palettes - conf.int = TRUE, # Add confidence interval - pval = TRUE, # Add p-value - risk.table = TRUE, # Add risk table - risk.table.col = "strata",# Risk table color by groups - legend.labs = - c("Male", "Female"), # Change legend labels - risk.table.height = 0.25, # Useful to change when you have multiple groups - ggtheme = theme_bw() # Change ggplot2 theme -)
Note that additional arguments are available to customize the main title, axis labels, the font style, axis limits, legends and the number at risk table.
-Focus on the xlim
and break.time.by
parameters, which do not change the calculation of the estimates of survival curves. Also note risk.table.y.text.col = TRUE
and risk.table.y.text = FALSE
, which provide bars instead of names in the text annotations of the risk table legend.
ggsurvplot(
-   fit,                     # survfit object with calculated statistics.
-   data = lung,             # data used to fit survival curves.
-   risk.table = TRUE,       # show risk table.
-   pval = TRUE,             # show p-value of log-rank test.
-   conf.int = TRUE,         # show confidence intervals for
-                            # point estimates of survival curves.
-   xlim = c(0,500),         # present narrower X axis, but not affect
-                            # survival estimates.
-   xlab = "Time in days",   # customize X axis label.
-   break.time.by = 100,     # break X axis in time intervals by 100.
-   ggtheme = theme_light(), # customize plot and risk table with a theme.
-   risk.table.y.text.col = TRUE, # colour risk table text annotations.
-   risk.table.y.text = FALSE # show bars instead of names in text annotations
-                            # in legend of risk table
-)
ggsurv <- ggsurvplot(
-           fit,                     # survfit object with calculated statistics.
-           data = lung,             # data used to fit survival curves.
-           risk.table = TRUE,       # show risk table.
-           pval = TRUE,             # show p-value of log-rank test.
-           conf.int = TRUE,         # show confidence intervals for
-                                    # point estimates of survival curves.
-           palette = c("#E7B800", "#2E9FDF"),
-           xlim = c(0,500),         # present narrower X axis, but not affect
-                                    # survival estimates.
-           xlab = "Time in days",   # customize X axis label.
-           break.time.by = 100,     # break X axis in time intervals by 100.
-           ggtheme = theme_light(), # customize plot and risk table with a theme.
-           risk.table.y.text.col = TRUE, # colour risk table text annotations.
-           risk.table.height = 0.25, # the height of the risk table
-           risk.table.y.text = FALSE, # show bars instead of names in text annotations
-                                    # in legend of risk table.
-           ncensor.plot = TRUE,     # plot the number of censored subjects at time t
-           ncensor.plot.height = 0.25,
-           conf.int.style = "step", # customize style of confidence intervals
-           surv.median.line = "hv", # add the median survival pointer.
-           legend.labs =
-             c("Male", "Female")    # change legend labels.
-          )
-ggsurv
Helper function to customize plot labels:
-customize_labels <- function (p, font.title = NULL,
-                              font.subtitle = NULL, font.caption = NULL,
-                              font.x = NULL, font.y = NULL, font.xtickslab = NULL, font.ytickslab = NULL)
-{
-  original.p <- p
-  if(is.ggplot(original.p)) list.plots <- list(original.p)
-  else if(is.list(original.p)) list.plots <- original.p
-  else stop("Can't handle an object of class ", class (original.p))
-  .set_font <- function(font){
-    font <- ggpubr:::.parse_font(font)
-    ggtext::element_markdown (size = font$size, face = font$face, colour = font$color)
-  }
-  for(i in 1:length(list.plots)){
-    p <- list.plots[[i]]
-    if(is.ggplot(p)){
-      if (!is.null(font.title)) p <- p + theme(plot.title = .set_font(font.title))
-      if (!is.null(font.subtitle)) p <- p + theme(plot.subtitle = .set_font(font.subtitle))
-      if (!is.null(font.caption)) p <- p + theme(plot.caption = .set_font(font.caption))
-      if (!is.null(font.x)) p <- p + theme(axis.title.x = .set_font(font.x))
-      if (!is.null(font.y)) p <- p + theme(axis.title.y = .set_font(font.y))
-      if (!is.null(font.xtickslab)) p <- p + theme(axis.text.x = .set_font(font.xtickslab))
-      if (!is.null(font.ytickslab)) p <- p + theme(axis.text.y = .set_font(font.ytickslab))
-      list.plots[[i]] <- p
-    }
-  }
-  if(is.ggplot(original.p)) list.plots[[1]]
-  else list.plots
-}
Customized plot labels:
-# Changing Labels
-# %%%%%%%%%%%%%%%%%%%%%%%%%%
-# Labels for Survival Curves (plot)
-ggsurv$plot <- ggsurv$plot + labs(
-  title    = "Survival curves",
-  subtitle = "Based on Kaplan-Meier estimates",
-  caption  = "created with survminer"
-  )
-
-# Labels for Risk Table
-ggsurv$table <- ggsurv$table + labs(
-  title    = "Note the risk set sizes",
-  subtitle = "and remember about censoring.",
-  caption  = "source code: website.com"
-  )
-
-# Labels for ncensor plot
-ggsurv$ncensor.plot <- ggsurv$ncensor.plot + labs(
-  title    = "Number of censorings",
-  subtitle = "over the time.",
-  caption  = "source code: website.com"
-  )
-
-# Changing the font size, style and color
-# %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-# Applying the same font style to all the components of ggsurv:
-# survival curves, risk table and censor part
-
-ggsurv <- customize_labels(
-  ggsurv,
-  font.title    = c(16, "bold", "darkblue"),
-  font.subtitle = c(15, "bold.italic", "purple"),
-  font.caption  = c(14, "plain", "orange"),
-  font.x        = c(14, "bold.italic", "red"),
-  font.y        = c(14, "bold.italic", "darkred"),
-  font.xtickslab = c(12, "plain", "darkgreen")
-)
-
-ggsurv
M. Kosiński. R-ADDICT January 2017. Comparing (Fancy) Survival Curves with Weighted Log-rank Tests
M. Kosiński. R-ADDICT January 2017. When You Went too Far with Survival Plots During the survminer 1st Anniversary
A. Kassambara. STHDA December 2016. Survival Analysis Basics: Curves and Logrank Tests
A. Kassambara. STHDA December 2016. Cox Proportional Hazards Model
A. Kassambara. STHDA December 2016. Cox Model Assumptions
M. Kosiński. R-ADDICT November 2016. Determine optimal cutpoints for numerical variables in survival plots
M. Kosiński. R-ADDICT May 2016. Survival plots have never been so informative
A. Kassambara. STHDA January 2016. survminer R package: Survival Data Analysis and Visualization.
NEWS.md
- element_text()
issues a warning when vectorized arguments are provided, as in colour = c("red", "green", "blue"). This is a breaking change affecting the function ggsurvtable()
. To fix this, the function ggtext::element_markdown()
is now used in place of element_text()
to handle vectorized colors (issue #455 fixed by pull #503).
log.rank.weights = "n"
is specified in the function ggsurvplot()
(#453)
ggsurvplot()
examples, the function gridExtra::rbind.gtable()
is now replaced by gridExtra::gtable_rbind()
(@jan-imbi, pull #493).
ggforest()
(pull 485).
survfit(res.cox)
returned an object of class survfit.cox. The class has been changed to survfitcox
in the current survival package version. The survminer package has now been updated to take this change into account (@edvbb, #441).
Fixes to adapt to dplyr 1.0.0 (@romainfrancois, #460):
-conf.int
in the ggsurvplot()
function. To fix this issue, NAs are now removed by default when drawing the confidence interval (#443 and #315).
cmprsk
is no longer needed for survminer installation. The package has been moved from Imports to Suggests. It’s only used in documentation (@massimofagg, #394).
ggflexsurvplot()
, the grouping variable can be factor or character vector (@andersbergren, #393)
-anova()
as requested (@pbiecek, #391)
-When a factor variable name is the same as one of its levels, ggsurvplot()
failed (@KohSoonho, #387). Fixed now.
ggsurvplot()
can now create correctly faceted survival curves (@uraniborg, #254, @BingxinS, #363)
A typo fixed in the formula for the weighted log-rank test (@MarcinKosinski, #336).
surv_summary()
can now handle the output of survfit(cox.model, newdata)
when the option conf.type = "none"
is specified by users (@HeidiSeibold, #335).
ggadjustedcurves()
has now flipped labels for conditional
/marginal
to match names from ’Adjusted Survival Curves’ by Terry Therneau, Cynthia Crowson, Elizabeth Atkinson (2015) (@pbiecek, #335).
ggsurvplot()
can be used to plot survreg model (@HeidiSeibold, #276, #325).
ggforest()
simply returns a ggplot instead of drawing automatically the plot (@grvsinghal, #267).
axes.offset
argument is also applied to risk table (@dmartinffm, #243).
ggsurvplot
to powerpoint document using ReporteRs even if there is no risk table (@DrRZ, #314).
size
added in ggadjustedcurves()
to change the curve size (@MaximilianTscharre, #267).
ggtheme
is supported when combining a list of survfit objects in ggsurvplot()
(@PhonePong, #278).
New function ggflexsurvplot()
to create ggplot2-based graphs for flexible survival models.
The function ggadjustedcurves()
handles now argument method
that defines how adjusted curves shall be calculated. With method='conditional'|'marginal'
subpopulations are balanced with respect to variables present in the model formula. With method='single'|'average'
the curve represents just the expected survival curves.
ggcoxadjustedcurves()
is replaced by ggadjustedcurves()
(#229).The grouping variable to the ggadjustedcurves()
function is now passed as a name (character) of grouping variable not as a vector with values of grouping variable.
New argument font.family
in ggsurvtable()
to change the font family in the survival tables, such as risk, cumulative events and censoring tables. For example font.family = "Courier New" (@Swechhya, #245).
Now, in ggsurvplot()
the data argument should be strictly provided (@dnzmarcio, #235)
ggforest()
no longer tries to bolt a table full of text to the coefficient plot (@mmoisse, #241), instead the annotations are done via ggplot2::annotate, see example at: @fabian-s, #264
-New argument test.for.trend
added in ggsurvplot()
to perform a log-rank test for trend. logical value. Default is FALSE. If TRUE, returns the test for trend p-values. Tests for trend are designed to detect ordered differences in survival curves. That is, for at least one group. The test for trend can be only performed when the number of groups is > 2 (#188).
New argument add.all
added now in ggsurvplot()
to add the survival curves of (all) pooled patients onto the main survival plot stratified by grouping variables. Alias of the ggsurvplot_add_all()
function (#194).
New argument combine = TRUE
is now available in the ggsurvplot()
function to combine a list of survfit objects on the same plot. Alias of the ggsurvplot_combine() function (#195).
The standard convention of ggplot2 is to have the axes offset from the origin. This can be annoying with Kaplan-Meier plots. New argument axes.offset
added now in ggsurvplot()
. Logical value. Default is TRUE. If FALSE, set the plot axes to start at the origin (c(0,0)) (#196).
The function ggsurvplot()
can take a list of survfit objects and produces a list of ggsurvplots (#204).
New argument facet.by
added now in ggsurvplot()
to draw multi-panel survival curves of a data set grouped by one or two variables. Alias of the ggsurvplot_facet()
function (#205).
New argument group.by
added now in ggsurvplot()
to create survival curves of grouped data sets. Alias of the ggsurvplot_group_by()
function.
In ggsurvplot()
, one can specify pval = TRUE/FALSE as a logical value. Now, it’s also possible to specify the argument pval
as a numeric value (e.g.: pval = 0.002), that will be passed to the plot, so that user can pass any custom p-value to the final plot (@MarcinKosinski, #189) or one can specify it as a character string (e.g.: pval = “p < 0001”) (@MarcinKosinski, #193).
New argument xscale
in ggsurvplot()
: numeric or character value specifying x-axis scale.
New arguments censor.shape
and censor.size
to change the shape and the size of censors (#186 & #187).
New argument conf.int.alpha
added in ggsurvplot()
. Numeric value specifying fill color transparency. Value should be in [0, 1], where 0 is full transparency and 1 is no transparency.
New function surv_group_by()
added to create a grouped data set for survival analysis.
New function ggsurvplot_df()
added. An extension to ggsurvplot() to plot survival curves from any data frame containing the summary of survival curves as returned the surv_summary() function. Might be useful for a user who wants to use ggsurvplot for visualizing survival curves computed by another method than the standard survfit.formula function. In this case, the user has just to provide the data frame containing the summary of the survival analysis.
New function surv_median()
added to easily extract median survivals from one or a list of survfit objects (#207).
New function surv_pvalue
() added to compute p-value from survfit objects or parse it when provided by the user. Survival curves are compared using the log-rank test (default). Other methods can be specified using the argument method.
New function surv_fit
() added to handle complex situations when computing survival curves (Read more in the doc: ?surv_fit). Wrapper around the standard survfit
() [survival] function to create survival curves. Compared to the standard survfit() function, it supports also:
ggforest()
function has changed a lot. Now presents much more statistics for each level of each variable (extracted with broom::tidy
) and also some statistics for the coxph
model, like AIC, p.value, concordance (extracted with broom::glance
) (#178)
Now, ggcompetingrisks()
supports the conf.int
argument. If conf.int=TRUE
and fit
is an object of class cuminc
then confidence intervals are plotted with geom_ribbon
.
Now, ggsurvplot()
supports the survfit()
outputs when used with the argument start.time
.
Now, the default behaviour of ggsurvplot()
is to round the number at risk using the option digits = 0
(#214).
pairwise_survdiff()
has been improved to handle a formula with multiple variables (#213).
The argument color
has been updated to allow assigning the same color to the same groups across facets (#99 & #185).
For example, in the following script, survival curves are colored by the grouping variable sex
in all facets:
library(survminer)
-library(survival)
-fit <- survfit( Surv(time, status) ~ sex + rx + adhere,
-                data = colon )
-ggsurv <- ggsurvplot(fit, data = colon,
-                     color = "sex",
-                     legend.title = "Sex",
-                     palette = "jco")
-ggsurv$plot + facet_grid(rx ~ adhere)
Now, the function pairwise_survdiff()
checks whether the grouping variable is a factor. If this is not the case, the grouping variable is automatically converted into a factor.
ggsurvplot()
: Now, log scale is used for x-axis when plotting the complementary log−log function (argument fun = “cloglog”) (#171).
Now, the argument palette
in ggsurvplot()
can also be a numeric vector of length(strata); in this case a basic color palette is created using the function grDevices::palette()
.
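To see what a numeric palette could resolve to, the sketch below indexes the base color palette directly. It assumes the numbers are used as positions in grDevices::palette(); the vector idx is a hypothetical choice for two strata, and the exact colors depend on the R version.

```r
# Hypothetical numeric palette for two strata
idx <- c(2, 4)

# Look up the corresponding colors in the base graphics palette
cols <- grDevices::palette()[idx]
cols
```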
The %+%
function in survminer
has been replaced by %++%
to avoid breaking the ggplot2::%+%
function behavior when using survminer (#199 and #200).
New argument fun
added in ggcoxadjustedcurves()
(@meganli, #202).
The function theme_classic2()
removed.
Columns/Rows are now correctly labeled in pairwise_survdiff
() display (@mriffle, #212).
Now, the pairwise_survdiff()
function works when the data contain NAs (@emilelatour, #184).
Now, ggsurvplot()
fully supports different methods, in the survMisc package, for comparing survival curves (#191).
ggcoxdiagnostics()
function and the vignette file Informative_Survival_Plots.Rmd
have been updated so that survminer
can pass CRAN check under R-oldrelease.
BMT
added for competing risk analysis.
BRCAOV.survInfo
added, used in vignette files.
palette
argument works in ggcoxadjustedcurves() (#174).
ggsurvplot()
works when the fun
argument is an arbitrary function (#176).
Additional data
argument added to the ggsurvplot()
function (@kassambara, #142). It is now recommended to pass the data used to fit the survival curves to the function. This avoids the error generated when trying to use the ggsurvplot()
function inside another function (@zzawadz, #125).
New argument risk.table.pos
, for placing risk table inside survival curves (#69). Allowed options are one of c(“out”, “in”) indicating ‘outside’ or ‘inside’ the main plot, respectively. Default value is “out”.
New arguments tables.height, tables.y.text, tables.theme, tables.col
: for customizing tables under the main survival plot (#156).
New arguments cumevents
and cumcensor
: logical value for displaying the cumulative number of events table (#117) and the cumulative number of censored subjects (#155), respectively.
Now, ggsurvplot()
can display both the number at risk and the cumulative number of censored subjects in the same table using the option risk.table = 'nrisk_cumcensor'
(#96). It’s also possible to display the number at risk and the cumulative number of events using the option risk.table = 'nrisk_cumevents'
.
New arguments pval.method
and log.rank.weights
: New possibilities to compare survival curves. Functionality based on survMisc::comp
.
New arguments break.x.by
and break.y.by
, numeric value controlling x and y axis breaks, respectively.
Now, ggsurvplot()
returns an object of class ggsurvplot which is a list containing the following components (#158):
New function theme_survminer()
to change easily the graphical parameters of plots generated with survminer (#151). A theme similar to theme_classic() with large font size. Used as default theme in survminer functions.
New function theme_cleantable()
to draw a clean risk table and cumulative number of events table. Remove axis lines, x axis ticks and title (#117 & #156).
# Fit survival curves
-require("survival")
-fit <- survfit(Surv(time, status) ~ sex, data = lung)
-
-# Survival curves
-require("survminer")
-ggsurvplot(fit, data = lung, risk.table = TRUE,
-           tables.theme = theme_cleantable()
-           )
+.ggsurv()
to add ggplot components - theme()
, labs()
- to an object of class ggsurv, which is a list of ggplots. (#151). For example:# Fit survival curves -require("survival") -fit<- survfit(Surv(time, status) ~ sex, data = lung) - -# Basic survival curves -require("survminer") -p <- ggsurvplot(fit, data = lung, risk.table = TRUE) -p - -# Customizing the plots -p %+% theme_survminer( - font.main = c(16, "bold", "darkblue"), - font.submain = c(15, "bold.italic", "purple"), - font.caption = c(14, "plain", "orange"), - font.x = c(14, "bold.italic", "red"), - font.y = c(14, "bold.italic", "darkred"), - font.tickslab = c(12, "plain", "darkgreen") -)
New function arrange_ggsurvplots()
to arrange multiple ggsurvplots on the same page (#66).
New function ggsurvevents()
to calculate and plot the distribution for events (both status = 0 and status = 1); with type
parameter one can plot the cumulative distribution or a locally smoothed density; with normalised, distributions are normalised. This function helps to notice when censoring is more common (@pbiecek, #116).
New function ggcoxadjustedcurves()
to plot adjusted survival curves for Cox proportional hazards model (@pbiecek, #133 & @markdanese, #67).
New function ggforest()
for drawing forest plot for the Cox model.
New function pairwise_survdiff()
for multiple comparisons of survival curves (#97).
New function ggcompetingrisks()
to plot the cumulative incidence curves for competing risks (@pbiecek, #168).
New helper functions ggrisktable()
, ggcumevents()
, ggcumcensor()
. Normally, users don’t need to use these functions directly. Internally used by the function ggsurvplot()
.
ggrisktable()
for plotting the number of subjects at risk by time (#154).
ggcumevents()
for plotting the cumulative number of events table (#117).
ggcumcensor()
for plotting the cumulative number of censored subjects table (#155).
New argument sline
in the ggcoxdiagnostics()
function for adding loess smoothed trend on the residual plots. This will make it easier to spot some problems with residuals (like quadratic relation). (@pbiecek, #119).
The design of ggcoxfunctional()
has been changed to be consistent with the other functions in the survminer package. Now, ggcoxfunctional()
works with coxph objects not formulas. The argument formula is now deprecated (@pbiecek, #115).
In the ggcoxdiagnostics()
function, it’s now possible to plot time on the OX axis (@pbiecek, #124). This is convenient for some residuals like Schoenfeld. The linear.predictions
parameter has been replaced with ox.scale = c("linear.predictions", "time", "observation.id")
.
New argument tables.height
in ggsurvplot()
to apply the same height to all the tables under the main survival plots (#157).
It is possible to specify title
and caption
for ggcoxfunctional
(@MarcinKosinski, #138) (font.main
was removed as it was unused.)
It is possible to specify title
, subtitle
and caption
for ggcoxdiagnostics
(@MarcinKosinski, #139) and fonts
for them.
It is possible to specify global caption
for ggcoxzph
(@MarcinKosinski, #140).
In ggsurvplot()
, more information about color palettes has been added to the details section of the documentation (#100).
The R package maxstat
does not fully support objects of class tbl_df
. To fix this issue, now, in the surv_cutpoint()
function, the input data is systematically transformed into a standard data.frame format (@MarcinKosinski, #104).
It’s now possible to print the output of the survminer package in a PowerPoint file created with the ReporteRs package. You should use the argument newpage = FALSE in the print()
function when printing the output to the PowerPoint file. Thanks to (@abossenbroek, #110) and (@zzawadz, #111). For instance:
require(survival) -require(ReporteRs) -require(survminer) - -fit <- survfit(Surv(time, status) ~ rx + adhere, data =colon) -survplot <- ggsurvplot(fit, pval = TRUE, - break.time.by = 400, - risk.table = TRUE, - risk.table.col = "strata", - risk.table.height = 0.5, # Useful when you have multiple groups - palette = "Dark2") - - -require(ReporteRs) -doc = pptx(title = "Survival plots") -doc = addSlide(doc, slide.layout = "Title and Content") -doc = addTitle(doc, "First try") -doc = addPlot(doc, function() print(survplot, newpage = FALSE), vector.graphic = TRUE) -writeDoc(doc, "test.pptx")
ggcoxdiagnostics()
, the option ncol = 1
is removed from the function facet_wrap()
. By default, ncol = NULL
. In this case, the number of columns and rows in the plot panels is defined automatically based on the number of covariates included in the cox model.
Now, the risk table aligns with survival plots when legend = “right” (@jonlehrer, #102).
Now, ggcoxzph()
works for univariate Cox analysis (#103).
Now, ggcoxdiagnostics()
works properly for Schoenfeld residuals (@pbiecek, #119).
Now, ggsurvplot()
works properly in the situation where strata()
is included in the cox formula (#109).
surv_summary()
(v0.2.3) generated an error when the name of the variable used in survfit()
can be found multiple times in the levels of the same variable. For example, variable = therapy; levels(therapy) –> “therapy” and “hormone therapy” (#86). This has now been fixed.
To extract variable names used in survival::survfit()
, the R code strsplit(strata, "=|,\\s+", perl=TRUE)
was used in the surv_summary()
function [survminer v0.2.3]. The splitting was done at any “=” symbol in the string, causing an error when special characters (=, <=, >=) are used for the levels of a categorical variable (#91). This has now been fixed.
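The splitting problem can be reproduced in base R. The sketch below uses a made-up strata label (strata) and shows one possible alternative that cuts each piece at its first “=” only. This is an illustration under those assumptions, not necessarily the fix applied in survminer, and it still assumes that levels do not contain a comma followed by whitespace.

```r
# A strata label for two variables, where one level contains "<="
strata <- "age=<=50, sex=1"

# Splitting at every "=" (the v0.2.3 pattern) breaks the level "<=50" apart
naive <- strsplit(strata, "=|,\\s+", perl = TRUE)[[1]]

# Alternative: separate the variables first, then cut each piece at its first "=" only
parts <- strsplit(strata, ",\\s+", perl = TRUE)[[1]]
vars <- sub("=.*$", "", parts)      # text before the first "="
lvls <- sub("^[^=]*=", "", parts)   # text after the first "="
```

Here naive has five elements instead of the expected four, while vars and lvls keep the variable names and their levels intact.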
Now, ggsurvplot()
draws correctly the risk.table (#93).
surv_summary()
for creating a data frame containing a nice summary of a survival curve (#64).
ggsurvplot()
by one or more factors (#64):
# Fit complex survival curves
-require("survival")
-fit3 <- survfit( Surv(time, status) ~ sex + rx + adhere,
-                data = colon )
-
-# Visualize by faceting
-# Plots are survival curves by sex faceted by rx and adhere factors.
-require("survminer")
-ggsurv <- ggsurvplot(fit3, data = colon)
-ggsurv$plot + theme_bw() + facet_grid(rx ~ adhere)
-ggsurvplot()
can be used to plot cox model (#67).
surv_cutpoint()
: Determine the optimal cutpoint for each variable using ‘maxstat’. Methods defined for surv_cutpoint object are summary(), print() and plot().
surv_categorize()
: Divide the values of each variable based on the cutpoint returned by surv_cutpoint()
(#41).
ggsurvplot()
. A logical value. If TRUE, the number of censored subjects at time t is plotted. Default is FALSE (#18).
ggsurvplot()
for changing the style of confidence interval bands.
ggsurvplot()
plots a stepped confidence interval when conf.int = TRUE (#65).
ggsurvplot()
updated for compatibility with the future version of ggplot2 (v2.2.0) (#68).
fun
. For example, if fun = “event”, then ylab will be “Cumulative event”.
ggsurvplot()
, linetypes can now be adjusted by variables used to fit survival curves (#46).
ggsurvplot()
, the argument risk.table can be either a logical value (TRUE|FALSE) or a string (“absolute”, “percentage”). If risk.table = “absolute”, ggsurvplot()
displays the absolute number of subjects at risk. If risk.table = “percentage”, the percentage at risk is displayed. Use “abs_pct” to show both the absolute number and the percentage of subjects at risk (#70).
ggsurvplot()
: character vector for drawing a horizontal/vertical line at median (50%) survival. Allowed values include one of c(“none”, “hv”, “h”, “v”). v: vertical, h: horizontal (#61).
ggcoxdiagnostics()
can now handle a multivariate Cox model (#62).
ggcoxfunctional()
now displays graphs of continuous variable against martingale residuals of null cox proportional hazards model (#63).
ggsurvplot()
to report the right p-value on the subset of the data and not on the whole data set (@jseoane, #71).
ggcoxzph()
can now produce plots only for a specified subset of variables (@MarcinKosinski, #75).
ggcoxdiagnostics
function that plots diagnostic graphs for Cox Proportional Hazards model (@MarcinKosinski, #16).
Survival plots have never been so informative
(@MarcinKosinski, #39)
ggsurvplot()
documentation. (@ViniciusBRodrigues, #43)
New ggcoxzph
function that displays a graph of the scaled Schoenfeld residuals, along with a smooth curve using ‘ggplot2’. Wrapper around plot.cox.zph. (@MarcinKosinski, #13)
New ggcoxfunctional
function that displays graphs of continuous explanatory variable against martingale residuals of null cox proportional hazards model, for each term on the right side of the input formula. This might help to properly choose the functional form of continuous variable in cox model, since fitted lines with lowess
function should be linear to satisfy cox proportional hazards model assumptions. (@MarcinKosinski, #14)
New function theme_classic2
: ggplot2 classic theme with axis lines. This function replaces ggplot2::theme_classic, which no longer displays axis lines (since ggplot2 v2.1.0)
risk.table.y.text.col
is now TRUE.
ggsurvplot
. Logical argument. Default is TRUE. If FALSE, risk table y axis tick labels will be hidden (@MarcinKosinski, #28).
New arguments in ggsurvplot for changing font style, size and color of main title, axis labels, axis tick labels and legend labels: font.main, font.x, font.y, font.tickslab, font.legend.
New arguments risk.table.title, risk.table.fontsize in ggsurvplot
New argument risk.table.y.text.col: logical value. Default value is FALSE. If TRUE, risk table tick labels will be colored by strata (@MarcinKosinski, #8).
print.ggsurvplot()
function added: S3 method for class ‘ggsurvplot’.
ggsurvplot returns an object of class ggsurvplot which is a list containing two ggplot objects:
-It’s now possible to customize the output survival plot and the risk table returned by ggsurvplot, and to print again the final plot. (@MarcinKosinski, #2):
# Fit survival curves
-require("survival")
-fit<- survfit(Surv(time, status) ~ sex, data = lung)
-
-# visualize
-require(survminer)
-ggsurvplot(fit, pval = TRUE, conf.int = TRUE,
- risk.table = TRUE)
-
-# Customize the output and then print
-res <- ggsurvplot(fit, pval = TRUE, conf.int = TRUE,
- risk.table = TRUE)
-res$table <- res$table + theme(axis.line = element_blank())
-res$plot <- res$plot + labs(title = "Survival Curves")
-print(res)
-ggtheme now affects risk.table (@MarcinKosinski, #1)
xlim changed to cartesian coordinates mode (@MarcinKosinski, #4). The Cartesian coordinate system is the most common type of coordinate system. It will zoom the plot (like you’re looking at it with a magnifying glass), without clipping the data.
Risk table and survival curves have now the same color and the same order
Plot width is no longer too small when legend position = “left” (@MarcinKosinski, #7).
Bone marrow transplant data from L Scrucca et al., Bone Marrow Transplantation (2007). Data from 35 patients with acute leukaemia who underwent HSCT. Used for competing risk analysis.
-data("BMT")- - -
A data frame with 35 rows and 3 columns.
- dis: disease; 0 = ALL; 1 = AML
- ftime: follow-up time
- status: 0 = censored (survival); 1 = Transplant-related mortality; 2 = relapse

Scrucca L, Santucci A, Aversa F. Competing risk analysis using R: an easy guide for clinicians. Bone Marrow Transplant. 2007 Aug;40(4):381-7.

data(BMT)
-if(require("cmprsk")){
-
-# Data preparation
-#+++++++++++++++++++++
-# Label diseases
-BMT$dis <- factor(BMT$dis, levels = c(0,1),
-   labels = c("ALL", "AML"))
-# Label status
-BMT$status <- factor(BMT$status, levels = c(0,1,2),
-  labels = c("Censored","Mortality","Relapse"))
-
-# Cumulative Incidence Function
-# ++++++++++++++++++++++++++
-fit <- cmprsk::cuminc(
-  ftime = BMT$ftime,      # Failure time variable
-  fstatus = BMT$status,   # Codes for different causes of failure
-  group = BMT$dis         # Estimates will be calculated within groups
- )
-
-# Visualize
-# +++++++++++++++++++++++
-ggcompetingrisks(fit)
-ggcompetingrisks(fit, multiple_panels = FALSE,
-  legend = "right")
-
-}
R/BRCAOV.survInfo.R
- BRCAOV.survInfo.Rd
Breast and ovarian cancer survival information from the RTCGA.clinical R/Bioconductor package. http://rtcga.github.io/RTCGA/.
-data("BRCAOV.survInfo")- - -
A data frame with 1674 rows and 4 columns.
- times: follow-up time;
- bcr_patient_barcode: Patient bar code;
- patient.vital_status = survival status. 0 = alive, 1 = dead;
- admin.disease_code: disease code. brca = breast cancer, ov = ovarian cancer.

From the RTCGA.clinical R/Bioconductor package. The data is generated as follows:

# Installing RTCGA.clinical
-BiocManager::install("RTCGA.clinical")
-
-# Generating the BRCAOV survival information
-library(RTCGA.clinical)
-survivalTCGA(BRCA.clinical, OV.clinical,
-  extract.cols = "admin.disease_code") -> BRCAOV.survInfo

-data(BRCAOV.survInfo)
-library(survival)
-fit <- survfit(Surv(times, patient.vital_status) ~ admin.disease_code,
-           data = BRCAOV.survInfo)
-ggsurvplot(fit, data = BRCAOV.survInfo, risk.table = TRUE)
Allows adding ggplot components (theme(), labs(), ...) to an object of class ggsurv, which is a list of ggplots.
-# S3 method for ggsurv -+(e1, e2) - -e1 %++% e2- -
| Argument | Description |
|---|---|
| e1 | an object of class ggsurv. |
| e2 | a plot component such as theme and labs. |
-# Fit survival curves -require("survival") -fit<- survfit(Surv(time, status) ~ sex, data = lung) - -# Basic survival curves -p <- ggsurvplot(fit, data = lung, risk.table = TRUE, - main = "Survival curve", - submain = "Based on Kaplan-Meier estimates", - caption = "created with survminer" - ) -p-# Customizing the plots -p + theme_survminer( - font.main = c(16, "bold", "darkblue"), - font.submain = c(15, "bold.italic", "purple"), - font.caption = c(14, "plain", "orange"), - font.x = c(14, "bold.italic", "red"), - font.y = c(14, "bold.italic", "darkred"), - font.tickslab = c(12, "plain", "darkgreen") -)
Arranging multiple ggsurvplots on the same page.
-arrange_ggsurvplots( - x, - print = TRUE, - title = NA, - ncol = 2, - nrow = 1, - surv.plot.height = NULL, - risk.table.height = NULL, - ncensor.plot.height = NULL, - ... -)- -
| Argument | Description |
|---|---|
| x | a list of ggsurvplots. |
| print | logical value. If TRUE, the arranged plots are displayed. |
| title | character vector specifying page title. Default is NA. |
| ncol, nrow | the number of columns and rows, respectively. |
| surv.plot.height | the height of the survival plot on the grid. Default is 0.75. Ignored when risk.table = FALSE. |
| risk.table.height | the height of the risk table on the grid. Increase the value when you have many strata. Default is 0.25. Ignored when risk.table = FALSE. |
| ncensor.plot.height | The height of the censor plot. Used when |
| ... | not used |

returns an invisible object of class arrangelist (see marrangeGrob), which can be saved into a pdf file using the function ggsave.
- ---# Fit survival curves -require("survival") -fit<- survfit(Surv(time, status) ~ sex, data = lung) - -# List of ggsurvplots -require("survminer") -splots <- list() -splots[[1]] <- ggsurvplot(fit, data = lung, risk.table = TRUE, ggtheme = theme_minimal()) -splots[[2]] <- ggsurvplot(fit, data = lung, risk.table = TRUE, ggtheme = theme_grey()) - -# Arrange multiple ggsurvplots and print the output -arrange_ggsurvplots(splots, print = TRUE, - ncol = 2, nrow = 1, risk.table.height = 0.4)-if (FALSE) { -# Arrange and save into pdf file -res <- arrange_ggsurvplots(splots, print = FALSE) -ggsave("myfile.pdf", res) -}
R/ggadjustedcurves.R
- ggadjustedcurves.Rd
The function surv_adjustedcurves()
calculates while the function ggadjustedcurves()
plots adjusted survival curves for the coxph
model.
-The main idea behind this function is to present expected survival curves calculated based on Cox model separately for subpopulations. The very detailed description and interesting discussion of adjusted curves is presented in 'Adjusted Survival Curves' by Terry Therneau, Cynthia Crowson, Elizabeth Atkinson (2015) https://cran.r-project.org/web/packages/survival/vignettes/adjcurve.pdf
.
-Many approaches are discussed in this article. Currently four approaches (two unbalanced, one conditional and one marginal) are implemented in the ggadjustedcurves()
function. See the section Details.
ggadjustedcurves( - fit, - variable = NULL, - data = NULL, - reference = NULL, - method = "conditional", - fun = NULL, - palette = "hue", - ylab = "Survival rate", - size = 1, - ggtheme = theme_survminer(), - ... -) - -surv_adjustedcurves( - fit, - variable = NULL, - data = NULL, - reference = NULL, - method = "conditional", - size = 1, - ... -)- -
| Argument | Description |
|---|---|
| fit | an object of class coxph.object - created with coxph function. |
| variable | a character, name of the grouping variable to be plotted. If not supplied then it will be extracted from the model formula from the |
| data | a dataset for predictions. If not supplied then data will be extracted from the |
| reference | a dataset for reference population, to which dependent variables should be balanced. If not specified, then the |
| method | a character, describes how the expected survival curves shall be calculated. Possible options: 'single' (average for population), 'average' (averages for subpopulations), 'marginal', 'conditional' (averages for subpopulations after rebalancing). See the Details section for further information. |
| fun | an arbitrary function defining a transformation of the survival curve. Often used transformations can be specified with a character argument: "event" plots cumulative events (f(y) = 1-y), "cumhaz" plots the cumulative hazard function (f(y) = -log(y)), and "pct" for survival probability in percentage. |
| palette | the color palette to be used. Allowed values include "hue" for the default hue color scale; "grey" for grey color palettes; brewer palettes e.g. "RdBu", "Blues", ...; or custom color palette e.g. c("blue", "red"); and scientific journal palettes from ggsci R package, e.g.: "npg", "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and "rickandmorty". See details section for more information. Can be also a numeric vector of length(groups); in this case a basic color palette is created using the function palette. |
| ylab | a label for oy axis. |
| size | the curve size. |
| ggtheme | function, ggplot2 theme name. Allowed values include ggplot2 official themes: see |
| ... | further arguments passed to the function |
Returns an object of class gg
.
Currently four approaches are implemented in the ggadjustedcurves()
function.
For method = "single"
a single survival curve is calculated and plotted. The curve presents an expected survival calculated for population data
based on the Cox model fit
.
For method = "average"
a separate survival curve is plotted for each level of a variable listed as variable
. If this argument is not specified, then it will be extracted from the strata
component of fit
argument. Each curve presents an expected survival calculated for subpopulation from data
based on a Cox model fit
. Note that in this method subpopulations are NOT balanced.
For method = "marginal"
a survival curve is plotted for each level of a grouping variable selected by variable
argument. If this argument is not specified, then it will be extracted from the strata
component of fit
object. Subpopulations are balanced with respect to variables in the fit
formula to keep distributions similar to these in the reference
population. If no reference population is specified, then the whole data
is used as a reference population instead. The balancing is performed in the following way: (1) for each subpopulation a logistic regression model is created to model the odds of being in the subpopulation against the reference population given the other variables listed in a fit
object, (2) inverse probabilities of belonging to a specified subpopulation are used as weights in the Cox model, (3) the Cox model is refitted with weights taken into account, (4) expected survival curves are calculated for each subpopulation based on the refitted weighted model.
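The four steps of the marginal method can be sketched with the survival package and glm(). This is a simplified illustration for two groups, not the code used inside ggadjustedcurves(). The names d, in_grp, ps_model, w, wfit and curves are hypothetical, rx enters the model as an ordinary covariate rather than a stratum, and step (4) is approximated by predicting at the mean covariate value.

```r
library(survival)

d <- bladder
d$rx <- factor(d$rx)

# (1) logistic model for membership of subpopulation rx == "1" given the other covariate
d$in_grp <- as.integer(d$rx == "1")
ps_model <- glm(in_grp ~ size, data = d, family = binomial())
p <- fitted(ps_model)

# (2) inverse probability of the group each row actually belongs to
d$w <- ifelse(d$in_grp == 1, 1 / p, 1 / (1 - p))

# (3) refit the Cox model with the weights
wfit <- coxph(Surv(stop, event) ~ size + rx, data = d, weights = w)

# (4) one curve per group, evaluated at the mean tumour size (a simplification)
nd <- data.frame(size = mean(d$size),
                 rx = factor(levels(d$rx), levels = levels(d$rx)))
curves <- survfit(wfit, newdata = nd)
```

With more than two groups, one logistic model per subpopulation would be needed, as the description states.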
For method = "conditional"
a separate survival curve is plotted for each level of a grouping variable selected by variable
argument. If this argument is not specified, then it will be extracted from the strata
component of fit
object. Subpopulations are balanced in the following way: (1) the data is replicated as many times as there are subpopulations (say k), (2) for each row in the original data a set of k copies is created and for every copy a different value of the grouping variable is assigned, which creates a new dataset balanced in terms of grouping variables, (3) expected survival is calculated for each subpopulation based on the new artificial dataset. Here the model fit
is not refitted.
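The conditional balancing described above can be approximated as follows. This is a sketch only, assuming the grouping variable rx is an ordinary covariate of the model; the names d, fit and adjusted are hypothetical, and the actual implementation in survminer may differ.

```r
library(survival)

d <- bladder
d$rx <- factor(d$rx)

# The model is fitted once and is not refitted afterwards
fit <- coxph(Surv(stop, event) ~ size + rx, data = d)

adjusted <- lapply(levels(d$rx), function(lv) {
  # Copy of the covariates in which every row is assigned to group lv
  copy <- d[, c("size", "rx")]
  copy$rx <- factor(lv, levels = levels(d$rx))
  # One predicted curve per row, then averaged over all rows
  sf <- survfit(fit, newdata = copy)
  data.frame(time = sf$time, surv = rowMeans(sf$surv), rx = lv)
})
adjusted <- do.call(rbind, adjusted)
head(adjusted)
```

Each group's curve is averaged over the same set of covariate values, so the groups are compared on a balanced artificial population.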
Note that surv_adjustedcurves
function calculates survival curves and based on this function one can calculate median survival.
library(survival)
-fit2 <- coxph( Surv(stop, event) ~ size, data = bladder )
-# single curve
-ggadjustedcurves(fit2, data = bladder)
-curve <- surv_adjustedcurves(fit2, data = bladder)
-
-fit2 <- coxph( Surv(stop, event) ~ size + strata(rx), data = bladder )
-# average in groups
-ggadjustedcurves(fit2, data = bladder, method = "average", variable = "rx")
-curve <- surv_adjustedcurves(fit2, data = bladder, method = "average", variable = "rx")
-
-# marginal balancing in groups
-ggadjustedcurves(fit2, data = bladder, method = "marginal", variable = "rx")
-curve <- surv_adjustedcurves(fit2, data = bladder, method = "marginal", variable = "rx")
-
-# selected reference population
-ggadjustedcurves(fit2, data = bladder, method = "marginal", variable = "rx",
-    reference = bladder[bladder$rx == "1",])
-
-# conditional balancing in groups
-ggadjustedcurves(fit2, data = bladder, method = "conditional", variable = "rx")
-curve <- surv_adjustedcurves(fit2, data = bladder, method = "conditional", variable = "rx")
-
-if (FALSE) {
-# this will take some time
-fdata <- flchain[flchain$futime >=7,]
-fdata$age2 <- cut(fdata$age, c(0,54, 59,64, 69,74,79, 89, 110),
-                  labels = c(paste(c(50,55,60,65,70,75,80),
-                                   c(54,59,64,69,74,79,89), sep='-'), "90+"))
-fdata$group <- factor(1+ 1*(fdata$flc.grp >7) + 1*(fdata$flc.grp >9),
-                      levels=1:3,
-                      labels=c("FLC < 3.38", "3.38 - 4.71", "FLC > 4.71"))
-# single curve
-fit <- coxph( Surv(futime, death) ~ age*sex, data = fdata)
-ggadjustedcurves(fit, data = fdata, method = "single")
-
-# average in groups
-fit <- coxph( Surv(futime, death) ~ age*sex + strata(group), data = fdata)
-ggadjustedcurves(fit, data = fdata, method = "average")
-
-# conditional balancing in groups
-ggadjustedcurves(fit, data = fdata, method = "conditional")
-
-# marginal balancing in groups
-ggadjustedcurves(fit, data = fdata, method = "marginal", reference = fdata)
-}
R/ggcompetingrisks.R
- ggcompetingrisks.Rd
This function plots Cumulative Incidence Curves. For cuminc
objects it's a ggplot2
version of plot.cuminc
.
-For survfitms
objects a different geometry is used, as suggested by @teigentler
.
ggcompetingrisks( - fit, - gnames = NULL, - gsep = " ", - multiple_panels = TRUE, - ggtheme = theme_survminer(), - coef = 1.96, - conf.int = FALSE, - ... -)- -
| Argument | Description |
|---|---|
| fit | an object of a class |
| gnames | a vector with group names. If not supplied then will be extracted from |
| gsep | a separator that extracts group names and event names from |
| multiple_panels | if |
| ggtheme | function, |
| coef | see |
| conf.int | if |
| ... | further arguments passed to the function |
Returns an object of class gg
.
-if (FALSE) { -if(require("cmprsk")){ - -set.seed(2) -ss <- rexp(100) -gg <- factor(sample(1:3,100,replace=TRUE),1:3,c('BRCA','LUNG','OV')) -cc <- factor(sample(0:2,100,replace=TRUE),0:2,c('no event', 'death', 'progression')) -strt <- sample(1:2,100,replace=TRUE) - -# handles cuminc objects -print(fit <- cmprsk::cuminc(ss,cc,gg,strt)) -ggcompetingrisks(fit) -ggcompetingrisks(fit, multiple_panels = FALSE) -ggcompetingrisks(fit, conf.int = TRUE) -ggcompetingrisks(fit, multiple_panels = FALSE, conf.int = TRUE) - -# handles survfitms objects -library(survival) -df <- data.frame(time = ss, group = gg, status = cc, strt) -fit2 <- survfit(Surv(time, status, type="mstate") ~ 1, data=df) -ggcompetingrisks(fit2) -fit3 <- survfit(Surv(time, status, type="mstate") ~ group, data=df) -ggcompetingrisks(fit3) -} - - library(ggsci) - library(cowplot) - ggcompetingrisks(fit3) + theme_cowplot() + scale_fill_jco() -}
R/ggcoxdiagnostics.R
- ggcoxdiagnostics.Rd
Displays diagnostic graphs presenting the goodness of fit of a Cox Proportional Hazards Model, which can be calculated with the coxph function.
-ggcoxdiagnostics( - fit, - type = c("martingale", "deviance", "score", "schoenfeld", "dfbeta", "dfbetas", - "scaledsch", "partial"), - ..., - linear.predictions = type %in% c("martingale", "deviance"), - ox.scale = ifelse(linear.predictions, "linear.predictions", "observation.id"), - hline = TRUE, - sline = TRUE, - sline.se = TRUE, - hline.col = "red", - hline.size = 1, - hline.alpha = 1, - hline.yintercept = 0, - hline.lty = "dashed", - sline.col = "blue", - sline.size = 1, - sline.alpha = 0.3, - sline.lty = "dashed", - point.col = "black", - point.size = 1, - point.shape = 19, - point.alpha = 1, - title = NULL, - subtitle = NULL, - caption = NULL, - ggtheme = ggplot2::theme_bw() -)- -
| Argument | Description |
|---|---|
| fit | an object of class coxph.object - created with coxph function. |
| type | the type of residuals to present on Y axis of a diagnostic plot. The same as in residuals.coxph: character string indicating the type of residual desired. Possible values are |
| ... | further arguments passed to |
| linear.predictions | (deprecated, see |
| ox.scale | one value from |
| hline | a logical - should the horizontal line be added to highlight the |
| sline, sline.se | a logical - should the smooth line be added to highlight the local average for residuals. |
| hline.col, hline.size, hline.lty, hline.alpha, hline.yintercept | color, size, linetype, visibility and Y-axis coordinate to be used for geom_hline. Used only when |
| sline.col, sline.size, sline.lty, sline.alpha | color, size, linetype and visibility to be used for geom_smooth. Used only when |
| point.col, point.size, point.shape, point.alpha | color, size, shape and visibility to be used for points. |
| title, subtitle, caption | main title, subtitle and caption. |
| ggtheme | function, ggplot2 theme name. Default value is ggplot2::theme_bw(). Allowed values include ggplot2 official themes: see |
Returns an object of class ggplot
.
ggcoxdiagnostics
: Diagnostic Plots for Cox Proportional Hazards Model with ggplot2
--library(survival) -coxph.fit2 <- coxph(Surv(futime, fustat) ~ age + ecog.ps, data=ovarian) -ggcoxdiagnostics(coxph.fit2, type = "deviance")#>-ggcoxdiagnostics(coxph.fit2, type = "schoenfeld", title = "Diagnostic plot")#>ggcoxdiagnostics(coxph.fit2, type = "deviance", ox.scale = "time")#> Warning: ox.scale='time' works only with type=schoenfeld/scaledsch#>ggcoxdiagnostics(coxph.fit2, type = "schoenfeld", ox.scale = "time", - title = "Diagnostic plot", subtitle = "Data comes from survey XYZ", - font.subtitle = 9)#>ggcoxdiagnostics(coxph.fit2, type = "deviance", ox.scale = "linear.predictions", - caption = "Code is available here - link", font.caption = 10)#>ggcoxdiagnostics(coxph.fit2, type = "schoenfeld", ox.scale = "observation.id")#>ggcoxdiagnostics(coxph.fit2, type = "scaledsch", ox.scale = "time")#>-
R/ggcoxfunctional.R
- ggcoxfunctional.Rd
Displays graphs of continuous explanatory variable against martingale residuals of null
-cox proportional hazards model, for each term on the right side of formula
. This might help to properly
-choose the functional form of continuous variable in cox model (coxph). Fitted lines with lowess function
-should be linear to satisfy cox proportional hazards model assumptions.
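The idea behind this diagnostic can be sketched with the survival package and base graphics. This is a minimal illustration, not the ggcoxfunctional() implementation; the names null_fit, mres and sm are chosen here, and the lowess settings mirror the defaults f = 0.6 and iter = 0 listed in the usage.

```r
library(survival)

# Null Cox model (no covariates) and its martingale residuals
null_fit <- coxph(Surv(futime, death) ~ 1, data = mgus)
mres <- residuals(null_fit, type = "martingale")

# Lowess smooth of the residuals against a continuous covariate
sm <- lowess(mgus$age, mres, f = 0.6, iter = 0)

# Residuals with the smooth on top; a roughly straight smooth suggests a linear form
plot(mgus$age, mres, col = "red", pch = 19, cex = 0.5,
     xlab = "age", ylab = "Martingale residuals of null Cox model")
lines(sm)
```

A clearly curved smooth would hint that a transformation of the covariate (for example a logarithm or a square) may be more appropriate.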
ggcoxfunctional( - formula, - data = NULL, - fit, - iter = 0, - f = 0.6, - point.col = "red", - point.size = 1, - point.shape = 19, - point.alpha = 1, - xlim = NULL, - ylim = NULL, - ylab = "Martingale Residuals \nof Null Cox Model", - title = NULL, - caption = NULL, - ggtheme = theme_survminer(), - ... -) - -# S3 method for ggcoxfunctional -print(x, ..., newpage = TRUE)- -
| Argument | Description |
|---|---|
| formula | a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the Surv function. |
| data | a |
| fit | an object of class coxph.object - created with coxph function. |
| iter | parameter of lowess. |
| f | parameter of lowess. |
| point.col, point.size, point.shape, point.alpha | color, size, shape and visibility to be used for points. |
| xlim, ylim | x and y axis limits e.g. xlim = c(0, 1000), ylim = c(0, 1). |
| ylab | y axis label. |
| title | the title of the final grob ( |
| caption | the caption of the final grob ( |
| ggtheme | function, ggplot2 theme name. Allowed values include ggplot2 official themes: see |
| ... | further arguments passed to the function |
| x | an object of class ggcoxfunctional |
| newpage | open a new page. See |
Returns an object of class ggcoxfunctional
which is a list of ggplots.
ggcoxfunctional
: Functional Form of Continuous Variable in Cox Proportional Hazards Model.
--library(survival) -data(mgus) -res.cox <- coxph(Surv(futime, death) ~ mspike + log(mspike) + I(mspike^2) + - age + I(log(age)^2) + I(sqrt(age)), data = mgus) -ggcoxfunctional(res.cox, data = mgus, point.col = "blue", point.alpha = 0.5)ggcoxfunctional(res.cox, data = mgus, point.col = "blue", point.alpha = 0.5, - title = "Pass the title", caption = "Pass the caption")- -
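The quantity described at the top of this page (martingale residuals of a null Cox model, smoothed with lowess) can be reproduced by hand. This is a rough sketch using only the survival package and base graphics, under the assumption that the null model is the formula's response regressed on an intercept only; it illustrates the idea and is not the ggcoxfunctional() implementation.

```r
library(survival)

# Null Cox model: no covariates on the right-hand side
null_fit <- coxph(Surv(futime, death) ~ 1, data = mgus)

# Martingale residuals of the null model, one per subject
mart_res <- residuals(null_fit, type = "martingale")

# Lowess smooth of the residuals against a continuous covariate,
# using the documented defaults f = 0.6 and iter = 0
smooth_age <- lowess(mgus$age, mart_res, f = 0.6, iter = 0)

plot(mgus$age, mart_res, col = "red", pch = 19,
     xlab = "age", ylab = "Martingale residuals of null Cox model")
lines(smooth_age)
```

A roughly straight smoothed line suggests the covariate can enter the model linearly; visible curvature suggests trying a transformation such as log() or sqrt().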
Displays a graph of the scaled Schoenfeld residuals, along with a - smooth curve using ggplot2. Wrapper around plot.cox.zph.
-ggcoxzph( - fit, - resid = TRUE, - se = TRUE, - df = 4, - nsmo = 40, - var, - point.col = "red", - point.size = 1, - point.shape = 19, - point.alpha = 1, - caption = NULL, - ggtheme = theme_survminer(), - ... -) - -# S3 method for ggcoxzph -print(x, ..., newpage = TRUE)- -
fit | -an object of class cox.zph. |
-
---|---|
resid | -a logical value, if TRUE the residuals are included on the plot, -as well as the smooth fit. |
-
se | -a logical value, if TRUE, confidence bands at two standard errors -will be added. |
-
df | -the degrees of freedom for the fitted natural spline, df=2 leads to -a linear fit. |
-
nsmo | -number of points used to plot the fitted spline. |
-
var | -the set of variables for which plots are desired. By default, plots -are produced in turn for each variable of a model. |
-
point.col, point.size, point.shape, point.alpha | -color, size, shape and visibility to be used for points. |
-
caption | -the caption of the final grob ( |
-
ggtheme | -function, ggplot2 theme name.
-Allowed values include ggplot2 official themes: see |
-
... | -further arguments passed to either the print() function or to the |
-
x | -an object of class ggcoxzph |
-
newpage | -open a new page. See |
-
Returns an object of class ggcoxzph
which is a list of ggplots.
Customizing the plots: The plot can be easily - customized using additional arguments to be passed to the function ggpar(). - Read ?ggpubr::ggpar. These arguments include - font.main,font.submain,font.caption,font.x,font.y,font.tickslab,font.legend: - a vector of length 3 indicating respectively the size (e.g.: 14), the style - (e.g.: "plain", "bold", "italic", "bold.italic") and the color (e.g.: "red") - of main title, subtitle, caption, xlab and ylab and axis tick labels, - respectively. For example font.x = c(14, "bold", "red"). Use font.x - = 14, to change only font size; or use font.x = "bold", to change only font - face.
-ggcoxzph
: Graphical Test of Proportional Hazards using ggplot2.
--library(survival) -fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps + rx, data=ovarian) -cox.zph.fit <- cox.zph(fit) -# plot all variables -ggcoxzph(cox.zph.fit)# plot all variables in specified order -ggcoxzph(cox.zph.fit, var = c("ecog.ps", "rx", "age"), font.main = 12)# plot specified variables in specified order -ggcoxzph(cox.zph.fit, var = c("ecog.ps", "rx"), font.main = 12, caption = "Caption goes here")-
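Note that the fit argument is a cox.zph object, not the coxph model itself. As a small sketch (survival package only), the numbers behind the plots can be inspected before plotting:

```r
library(survival)

fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps + rx, data = ovarian)

# ggcoxzph() expects this cox.zph object, not the coxph fit
cox.zph.fit <- cox.zph(fit)

class(cox.zph.fit)

# Per-variable and global test statistics with p-values
cox.zph.fit$table
```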
Create ggplot2-based graphs for flexible survival models.
-ggflexsurvplot( - fit, - data = NULL, - fun = c("survival", "cumhaz"), - summary.flexsurv = NULL, - size = 1, - conf.int = FALSE, - conf.int.flex = conf.int, - conf.int.km = FALSE, - legend.labs = NULL, - ... -)- -
fit | -an object of class |
-
---|---|
data | -the data used to fit survival curves. |
-
fun | -the type of survival curves. Allowed values include "survival" -(default) and "cumhaz" (for cumulative hazard). |
-
summary.flexsurv | -(optional) the summary of the |
-
size | -line size for the flexible survival estimates. |
-
conf.int, conf.int.flex | -logical. If TRUE, add confidence bands for -flexible survival estimates. |
-
conf.int.km | -same as |
-
legend.labs | -character vector specifying legend labels. Used to replace -the names of the strata from the fit. Should be given in the same order as -those strata. |
-
... | -additional arguments passed to the function |
-
a ggsurvplot
- --# \donttest{ -if(require("flexsurv")) { -fit <- flexsurvreg(Surv(rectime, censrec) ~ group, - dist = "gengamma", data = bc) -ggflexsurvplot(fit) -}#># } - -
Draws a forest plot for a Cox proportional hazards model. The model structure is presented in two panels.
-ggforest( - model, - data = NULL, - main = "Hazard ratio", - cpositions = c(0.02, 0.22, 0.4), - fontsize = 0.7, - refLabel = "reference", - noDigits = 2 -)- -
model | -an object of class coxph. |
-
---|---|
data | -a dataset used to fit survival curves. If not supplied then data -will be extracted from 'fit' object. |
-
main | -title of the plot. |
-
cpositions | -relative positions of first three columns in the OX scale. |
-
fontsize | -relative size of annotations in the plot. Default value: 0.7. |
-
refLabel | -label for reference levels of factor variables. |
-
noDigits | -number of digits for estimates and p-values in the plot. |
-
returns a ggplot2 object (invisibly)
- --require("survival") -model <- coxph( Surv(time, status) ~ sex + rx + adhere, - data = colon ) -ggforest(model)#> Warning: The `data` argument is not provided. Data will be extracted from model fit.-colon <- within(colon, { - sex <- factor(sex, labels = c("female", "male")) - differ <- factor(differ, labels = c("well", "moderate", "poor")) - extent <- factor(extent, labels = c("submuc.", "muscle", "serosa", "contig.")) -}) -bigmodel <- - coxph(Surv(time, status) ~ sex + rx + adhere + differ + extent + node4, - data = colon ) -ggforest(bigmodel)#> Warning: The `data` argument is not provided. Data will be extracted from model fit.-
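The plot is titled "Hazard ratio" by default. As a hedged sketch of where such numbers come from, the hazard ratios and their Wald confidence limits can be computed from the same model with the survival package alone; the exact values and formatting that ggforest() displays are not confirmed here.

```r
library(survival)

model <- coxph(Surv(time, status) ~ sex + rx + adhere, data = colon)

# Hazard ratios are the exponentiated coefficients
hr <- exp(coef(model))

# 95% confidence limits on the hazard-ratio scale
ci <- exp(confint(model))

cbind(HR = hr, ci)
```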
Distribution of Events' Times
-ggsurvevents( - surv = NULL, - fit = NULL, - data = NULL, - type = "fraction", - normalized = TRUE, - censored.on.top = TRUE, - ggtheme = theme_survminer(), - palette = c("grey75", "grey25"), - ... -)- -
surv | -an object of Surv. If not supplied, the censoring variable is extracted from the model. |
-
---|---|
fit | -an object of class survfit. |
-
data | -a dataset for predictions. If not supplied then data will be extracted from `fit` object. |
-
type | -one of |
-
normalized | -if |
-
censored.on.top | -if TRUE, censored events are placed on top |
-
ggtheme | -function, ggplot2 theme name. Allowed values include ggplot2 official themes: see theme. |
-
palette | -the color palette to be used for coloring of significant variables. |
-
... | -other graphical parameters to be passed to the function ggpar. |
-
return an object of class ggplot
- --ggsurvevents(surv2, normalized = TRUE)-# from survfit -fit <- survfit(Surv(time, status) ~ sex, data = lung) -ggsurvevents(fit = fit, data = lung)#> Warning: The `surv` argument is not provided. The censored variable will be extracted from model fit.-# from coxph -model <- coxph( Surv(time, status) ~ sex + rx + adhere, data = colon ) -ggsurvevents(fit = model, data = colon)#> Warning: The `surv` argument is not provided. The censored variable will be extracted from model fit.ggsurvevents(surv2, normalized = TRUE, type = "radius")ggsurvevents(surv2, normalized = TRUE, type = "fraction")-
ggsurvplot
() is a generic function to plot survival curves. Wrapper
- around the ggsurvplot_xx()
family functions. Plot one or a list of
- survfit objects as generated by the
- survfit.formula() and surv_fit functions:
See the documentation for each function to
- learn how to control that aspect of the ggsurvplot().
- ggsurvplot
() accepts further arguments to be passed to the
- ggsurvplot_xx()
functions. Has options to:
plot a list of survfit objects,
facet survival curves into multiple - panels,
group dataset by one or two grouping variables and to create - the survival curves in each subset,
combine multiple survfit
- objects into one plot,
add survival curves of the pooled patients - (null model) onto the main stratified plot,
plot survival curves from - a data frame containing survival curve summary as returned by - surv_summary().
ggsurvplot( - fit, - data = NULL, - fun = NULL, - color = NULL, - palette = NULL, - linetype = 1, - conf.int = FALSE, - pval = FALSE, - pval.method = FALSE, - test.for.trend = FALSE, - surv.median.line = "none", - risk.table = FALSE, - cumevents = FALSE, - cumcensor = FALSE, - tables.height = 0.25, - group.by = NULL, - facet.by = NULL, - add.all = FALSE, - combine = FALSE, - ggtheme = theme_survminer(), - tables.theme = ggtheme, - ... -) - -# S3 method for ggsurvplot -print( - x, - surv.plot.height = NULL, - risk.table.height = NULL, - ncensor.plot.height = NULL, - newpage = TRUE, - ... -)- -
fit | -allowed values include:
|
-
---|---|
data | -a dataset used to fit survival curves. If not supplied then data -will be extracted from 'fit' object. |
-
fun | -an arbitrary function defining a transformation of the survival -curve. Often used transformations can be specified with a character -argument: "event" plots cumulative events (f(y) = 1-y), "cumhaz" plots the -cumulative hazard function (f(y) = -log(y)), and "pct" for survival -probability in percentage. |
-
color | -color to be used for the survival curves.
|
-
palette | -the color palette to be used. Allowed values include "hue" for -the default hue color scale; "grey" for grey color palettes; brewer palettes -e.g. "RdBu", "Blues", ...; or custom color palette e.g. c("blue", "red"); and scientific journal palettes from ggsci R package, e.g.: "npg", - "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and - "rickandmorty". -See details section for more information. Can be also a numeric vector of -length(groups); in this case a basic color palette is created using the -function palette. |
-
linetype | -line types. Allowed values include i) "strata" for changing -linetypes by strata (i.e. groups); ii) a numeric vector (e.g., c(1, 2)) or a -character vector c("solid", "dashed"). |
-
conf.int | -logical value. If TRUE, plots confidence interval. |
-
pval | -logical value, a numeric or a string. If logical and TRUE, the -p-value is added on the plot. If numeric, then the computed p-value is -substituted with the one passed with this parameter. If character, then the -customized string appears on the plot. See examples - Example 3. |
-
pval.method | -whether to add a text with the test name used for
-calculating the pvalue, that corresponds to survival curves' comparison -
-used only when |
-
test.for.trend | -logical value. Default is FALSE. If TRUE, returns the -test for trend p-values. Tests for trend are designed to detect ordered -differences in survival curves. That is, for at least one group. The test -for trend can be only performed when the number of groups is > 2. |
-
surv.median.line | -character vector for drawing a horizontal/vertical -line at median survival. Allowed values include one of c("none", "hv", "h", -"v"). v: vertical, h:horizontal. |
-
risk.table | -Allowed values include:
|
-
cumevents | -logical value specifying whether to show or not the table of -the cumulative number of events. Default is FALSE. |
-
cumcensor | -logical value specifying whether to show or not the table of -the cumulative number of censoring. Default is FALSE. |
-
tables.height | -numeric value (in [0 - 1]) specifying the general height -of all tables under the main survival plot. |
-
group.by | -a character vector containing the name of grouping variables. Should be of length <= 2.
-Alias of the |
-
facet.by | -a character vector containing the name of grouping variables
-to facet the survival curves into multiple panels. Should be of length <= 2.
-Alias of the |
-
add.all | -a logical value. If TRUE, add the survival curve of pooled patients (null model) onto the main plot.
-Alias of the |
-
combine | -a logical value. If TRUE, combine a list survfit objects on the same plot.
-Alias of the |
-
ggtheme | -function, ggplot2 theme name. Default value is
-theme_survminer. Allowed values include ggplot2 official themes: see
- |
-
tables.theme | -function, ggplot2 theme name. Default value is
-theme_survminer. Allowed values include ggplot2 official themes: see
- |
-
... | -Further arguments as described hereafter and -other arguments to be passed i) to ggplot2 geom_*() functions such - as linetype, size, ii) or to the function ggpar() for - customizing the plots. See details section. |
-
x | -an object of class ggsurvplot |
-
surv.plot.height | -the height of the survival plot on the grid. Default -is 0.75. Ignored when risk.table = FALSE. |
-
risk.table.height | -the height of the risk table on the grid. Increase -the value when you have many strata. Default is 0.25. Ignored when -risk.table = FALSE. |
-
ncensor.plot.height | -The height of the censor plot. Used when
- |
-
newpage | -open a new page. See |
-
return an object of class ggsurvplot, which is a list containing the - following components:
plot: the survival plot (ggplot - object)
table: the number of subjects at risk table per time (ggplot - object).
cumevents: the cumulative number of events table (ggplot - object).
ncensor.plot: the number of censoring (ggplot object).
data.survplot: the data used to plot the survival curves (data.frame).
data.survtable: the data used to plot the tables under the main survival - curves (data.frame).
Color palettes: The argument palette can be used to
- specify the color to be used for each group. By default, the first color in
- the palette is used to color the first level of the factor variable. This
- default behavior can be changed by assigning correctly a named vector. That
- is, the names of colors should match the strata names as generated by the
- ggsurvplot()
function in the legend.
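To build such a named vector, the strata names can be read from the survfit object. This is a minimal sketch using only the survival package; the colors are arbitrary, and it assumes the legend uses the default strata names (no legend.labs override).

```r
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Strata names as stored in the survfit object
strata_names <- names(fit$strata)

# Named palette: each color is tied to a stratum by name, not by position
pal <- setNames(c("#2E9FDF", "#E7B800"), strata_names)
pal
```

The resulting vector can then be passed as palette = pal to ggsurvplot().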
Customize survival plots and tables. See also ggsurvplot_arguments.
-title: main title.
xlab, ylab: x and y axis labels, respectively.
legend: character specifying legend position. Allowed values are one of - c("top", "bottom", "left", "right", "none"). Default is "top" side position. - To remove the legend, use legend = "none". Legend position can also be - specified using a numeric vector c(x, y). In this case it is - possible to position the legend inside the plotting area. x and y are the - coordinates of the legend box. Their values should be between 0 and 1. - c(0,0) corresponds to the "bottom left" and c(1,1) corresponds to the "top - right" position. For instance use legend = c(0.8, 0.2).
legend.title: legend title.
legend.labs: character vector specifying legend labels. Used to replace - the names of the strata from the fit. Should be given in the same order as - those strata.
break.time.by: numeric value controlling time axis breaks. Default value - is NULL.
break.x.by: alias of break.time.by. Numeric value controlling x axis - breaks. Default value is NULL.
break.y.by: same as break.x.by but for y axis.
surv.scale: scale transformation of survival curves. Allowed values are - "default" or "percent".
xscale: numeric or character value specifying x-axis scale.
If numeric, the value is used to divide the labels on the x axis. For - example, a value of 365.25 will give labels in years instead of the original - days.
If character, allowed options include one of - "d_m", "d_y",
- "m_d", "m_y", "y_d" and "y_m" - where d = days
, m = months
and y = years
. For
- example, xscale = "d_m"
will transform labels from days to months; xscale =
- "m_y"
, will transform labels from months to years.
xlim,ylim: x and y axis limits e.g. xlim = c(0, 1000), ylim = c(0, 1).
axes.offset: logical value. Default is TRUE. If FALSE, set the plot axes to start at the origin.
conf.int.fill: fill color to be used for confidence interval.
conf.int.style: confidence interval style. Allowed values include c("ribbon", "step").
conf.int.alpha: numeric value specifying confidence fill color transparency. - Value should be in [0, 1], where 0 is full transparency and 1 is no transparency.
pval.size: numeric value specifying the p-value text size. Default is 5.
pval.coord: numeric vector, of length 2, - specifying the x and y coordinates of the p-value. - Default values are NULL.
pval.method.size: the same as pval.size
but for displaying
- log.rank.weights
name.
pval.method.coord: the same as pval.coord
but for displaying
- log.rank.weights
name.
log.rank.weights: the name for the type of weights to be used in
- computing the p-value for log-rank test. By default survdiff
is used
- to calculate regular log-rank test (with weights == 1). A user can specify
- "1", "n", "sqrtN", "S1", "S2", "FH"
to use weights specified in
- comp, so that the weights correspond to the tests as follows: 1 -
- log-rank, n - Gehan-Breslow (generalized Wilcoxon), sqrtN - Tarone-Ware, S1
- - Peto-Peto's modified survival estimate, S2 - modified Peto-Peto (by
- Andersen), FH - Fleming-Harrington(p=1, q=1).
surv.median.line: character vector for drawing a - horizontal/vertical line at median survival. - Allowed values include one of c("none", "hv", "h", "v"). v: vertical, h:horizontal.
censor: logical value. If TRUE (default), censors will be drawn.
censor.shape: character or numeric value specifying the point shape of censors. - Default value is "+" (3), a sensible choice is "|" (124).
censor.size: numeric value specifying the point size of censors. Default is 4.5.
General parameters for all tables. - The arguments below, when specified, will be applied to all survival tables at once - (risk, cumulative events and cumulative censoring tables).
tables.col: color to be used for all tables under the main plot. Default - value is "black". If you want to color by strata (i.e. groups), use - tables.col = "strata".
fontsize: font size to be used for the risk table - and the cumulative events table.
font.family: character vector specifying text element font family, - e.g.: font.family = "Courier New".
tables.y.text: logical. Default is TRUE. If FALSE, the y axis tick - labels of tables will be hidden.
tables.y.text.col: logical. Default value is FALSE. If TRUE, the y - tick labels of tables will be colored by strata.
tables.height: numeric value (in [0 - 1]) specifying the general height - of all tables under the main survival plot. - Increase the value when you have many strata. Default is 0.25.
Specific to the risk table
risk.table.title: the title to be used for the risk table.
risk.table.pos: character vector specifying the risk table position. - Allowed options are one of c("out", "in") indicating 'outside' or 'inside' - the main plot, respectively. Default value is "out".
risk.table.col
, risk.table.fontsize
, risk.table.y.text
,
- risk.table.y.text.col
and risk.table.height
: same as for the general parameters
- but applied to the risk table only.
Specific to the number of cumulative events table (cumevents)
cumevents.title: the title to be used for the cumulative events table.
cumevents.col, cumevents.y.text, cumevents.y.text.col, cumevents.height
:
- same as for the general parameters but for the cumevents table only.
Specific to the number of cumulative censoring table (cumcensor)
cumcensor.title: the title to be used for the cumcensor table.
cumcensor.col
, cumcensor.y.text
, cumcensor.y.text.col
, cumcensor.height
:
- same as for the general parameters but for cumcensor table only.
surv.plot.height: the height of the survival plot on the grid. Default - is 0.75. Ignored when risk.table = FALSE.
ncensor.plot: logical value. If TRUE, the number of censored subjects at - time t is plotted. Default is FALSE. Ignored when cumcensor = TRUE.
ncensor.plot.title: the title to be used for the censor plot. Used when
- ncensor.plot = TRUE
.
ncensor.plot.height: the height of the censor plot. Used when
- ncensor.plot = TRUE
.
The plot can be easily customized using additional arguments to be
- passed to the function ggpar()
.
These arguments include
- font.title, font.subtitle, font.caption, font.x, font.y, font.tickslab and font.legend
,
- which are vectors of length 3 indicating respectively the size
- (e.g.: 14), the style (e.g.: "plain", "bold", "italic", "bold.italic") and
- the color (e.g.: "red") of main title, subtitle, caption, xlab and ylab,
- axis tick labels and legend, respectively. For example font.x = c(14,
- "bold", "red").
Use font.x = 14, to change only font size; or use font.x = - "bold", to change only font face.
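The fun transformations and the numeric form of xscale described above are simple arithmetic on the survival curve. The sketch below applies them by hand to a Kaplan-Meier fit from the survival package; the variable names are illustrative only, and the code is not taken from ggsurvplot().

```r
library(survival)

fit <- survfit(Surv(time, status) ~ 1, data = lung)
s <- fit$surv        # Kaplan-Meier survival probabilities
t_days <- fit$time   # follow-up time in days

event  <- 1 - s      # fun = "event":  f(y) = 1 - y
cumhaz <- -log(s)    # fun = "cumhaz": f(y) = -log(y)
pct    <- 100 * s    # fun = "pct":    survival probability in percent

# Numeric xscale = 365.25: labels in years instead of days
t_years <- t_days / 365.25

head(data.frame(t_days, t_years, surv = s, event, cumhaz, pct))
```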
- ---#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -# Example 1: Survival curves with two groups -#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -# Fit survival curves -#++++++++++++++++++++++++++++++++++++ -require("survival") -fit<- survfit(Surv(time, status) ~ sex, data = lung) - -# Basic survival curves -ggsurvplot(fit, data = lung)-# Customized survival curves -ggsurvplot(fit, data = lung, - surv.median.line = "hv", # Add medians survival - - # Change legends: title & labels - legend.title = "Sex", - legend.labs = c("Male", "Female"), - # Add p-value and tervals - pval = TRUE, - - conf.int = TRUE, - # Add risk table - risk.table = TRUE, - tables.height = 0.2, - tables.theme = theme_cleantable(), - - # Color palettes. Use custom color: c("#E7B800", "#2E9FDF"), - # or brewer color (e.g.: "Dark2"), or ggsci color (e.g.: "jco") - palette = c("#E7B800", "#2E9FDF"), - ggtheme = theme_bw() # Change ggplot2 theme -)-# Change font size, style and color -#++++++++++++++++++++++++++++++++++++ -if (FALSE) { -# Change font size, style and color at the same time -ggsurvplot(fit, data = lung, main = "Survival curve", - font.main = c(16, "bold", "darkblue"), - font.x = c(14, "bold.italic", "red"), - font.y = c(14, "bold.italic", "darkred"), - font.tickslab = c(12, "plain", "darkgreen")) -} - - - -#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -# Example 2: Facet ggsurvplot() output by -# a combination of factors -#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -# Fit (complexe) survival curves -#++++++++++++++++++++++++++++++++++++ -if (FALSE) { -require("survival") -fit3 <- survfit( Surv(time, status) ~ sex + rx + adhere, - data = colon ) - -# Visualize -#++++++++++++++++++++++++++++++++++++ -ggsurv <- ggsurvplot(fit3, data = colon, - fun = "cumhaz", conf.int = TRUE, - risk.table = TRUE, risk.table.col="strata", - ggtheme = theme_bw()) - -# Faceting survival curves -curv_facet <- ggsurv$plot + facet_grid(rx ~ adhere) -curv_facet - -# Faceting risk tables: -# Generate risk table for each facet plot item 
-ggsurv$table + facet_grid(rx ~ adhere, scales = "free")+ - theme(legend.position = "none") - - # Generate risk table for each facet columns -tbl_facet <- ggsurv$table + facet_grid(.~ adhere, scales = "free") -tbl_facet + theme(legend.position = "none") - -# Arrange faceted survival curves and risk tables -g2 <- ggplotGrob(curv_facet) -g3 <- ggplotGrob(tbl_facet) -min_ncol <- min(ncol(g2), ncol(g3)) -g <- gridExtra::gtable_rbind(g2[, 1:min_ncol], g3[, 1:min_ncol], size="last") -g$widths <- grid::unit.pmax(g2$widths, g3$widths) -grid::grid.newpage() -grid::grid.draw(g) - -} - -#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -# Example 3: CUSTOMIZED PVALUE -#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -# Customized p-value -ggsurvplot(fit, data = lung, pval = TRUE)ggsurvplot(fit, data = lung, pval = 0.03)ggsurvplot(fit, data = lung, pval = "The hot p-value is: 0.031")-
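When pval is given as a number or a string, the value usually comes from a test run separately. As a hedged sketch, the regular log-rank test that the details section attributes to survdiff can be run directly with the survival package; the label text built at the end is an arbitrary example.

```r
library(survival)

# Regular log-rank test (all weights equal to 1)
lr <- survdiff(Surv(time, status) ~ sex, data = lung)

# p-value from the chi-square statistic,
# with (number of groups - 1) degrees of freedom
p <- 1 - pchisq(lr$chisq, df = length(lr$n) - 1)
p

# A string of the kind passed to pval in Example 3
paste("Log-rank p =", signif(p, 2))
```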
R/ggsurvplot_add_all.R
- ggsurvplot_add_all.Rd
Add survival curves of pooled patients onto the main plot stratified by grouping variables.
-ggsurvplot_add_all( - fit, - data, - legend.title = "Strata", - legend.labs = NULL, - pval = FALSE, - ... -)- -
fit | -an object of class survfit. |
-
---|---|
data | -a dataset used to fit survival curves. If not supplied then data -will be extracted from 'fit' object. |
-
legend.title | -legend title. |
-
legend.labs | -character vector specifying legend labels. Used to replace -the names of the strata from the fit. Should be given in the same order as -those strata. |
-
pval | -logical value, a numeric or a string. If logical and TRUE, the -p-value is added on the plot. If numeric, then the computed p-value is -substituted with the one passed with this parameter. If character, then the -customized string appears on the plot. See examples - Example 3. |
-
... | -other arguments passed to the |
-
Return a ggsurvplot.
--library(survival) - -# Fit survival curves -fit <- surv_fit(Surv(time, status) ~ sex, data = lung) - -# Visualize survival curves -ggsurvplot(fit, data = lung, - risk.table = TRUE, pval = TRUE, - surv.median.line = "hv", palette = "jco")-# Add survival curves of pooled patients (Null model) -# Use add.all = TRUE option -ggsurvplot(fit, data = lung, - risk.table = TRUE, pval = TRUE, - surv.median.line = "hv", palette = "jco", - add.all = TRUE)-
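The pooled curve corresponds to a null model fitted to all patients at once. A minimal sketch with the survival package (using survfit() here in place of surv_fit()) shows the two fits side by side; how ggsurvplot_add_all() builds the pooled curve internally is not confirmed here.

```r
library(survival)

# Stratified fit, as in the example above
fit_sex <- survfit(Surv(time, status) ~ sex, data = lung)

# Pooled (null model) fit: all patients in one group
fit_all <- survfit(Surv(time, status) ~ 1, data = lung)

# The pooled curve uses every subject; the strata split the same subjects
fit_all$n
sum(fit_sex$n)
```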
ggsurvplot Argument Descriptions
-fit | -an object of class survfit. |
-
---|---|
data | -a dataset used to fit survival curves. If not supplied then data -will be extracted from 'fit' object. |
-
fun | -an arbitrary function defining a transformation of the survival -curve. Often used transformations can be specified with a character -argument: "event" plots cumulative events (f(y) = 1-y), "cumhaz" plots the -cumulative hazard function (f(y) = -log(y)), and "pct" for survival -probability in percentage. |
-
surv.scale | -scale transformation of survival curves. Allowed values are -"default" or "percent". |
-
xscale | -numeric or character value specifying x-axis scale.
|
-
color | -color to be used for the survival curves.
|
-
palette | -the color palette to be used. Allowed values include "hue" for -the default hue color scale; "grey" for grey color palettes; brewer palettes -e.g. "RdBu", "Blues", ...; or custom color palette e.g. c("blue", "red"); and scientific journal palettes from ggsci R package, e.g.: "npg", - "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and - "rickandmorty". -See details section for more information. Can be also a numeric vector of -length(groups); in this case a basic color palette is created using the -function palette. |
-
linetype | -line types. Allowed values include i) "strata" for changing -linetypes by strata (i.e. groups); ii) a numeric vector (e.g., c(1, 2)) or a -character vector c("solid", "dashed"). |
-
break.time.by | -numeric value controlling time axis breaks. Default value -is NULL. |
-
break.x.by | -alias of break.time.by. Numeric value controlling x axis -breaks. Default value is NULL. |
-
break.y.by | -same as break.x.by but for y axis. |
-
conf.int | -logical value. If TRUE, plots confidence interval. |
-
conf.int.fill | -fill color to be used for confidence interval. |
-
conf.int.style | -confidence interval style. Allowed values include -c("ribbon", "step"). |
-
conf.int.alpha | -numeric value specifying fill color transparency. Value -should be in [0, 1], where 0 is full transparency and 1 is no transparency. |
-
censor | -logical value. If TRUE, censors will be drawn. |
-
censor.shape | -character or numeric value specifying the point shape of -censors. Default value is "+" (3), a sensible choice is "|" (124). |
-
censor.size | -numeric value specifying the point size of censors. -Default is 4.5. |
-
pval | -logical value, a numeric or a string. If logical and TRUE, the -p-value is added on the plot. If numeric, then the computed p-value is -substituted with the one passed with this parameter. If character, then the -customized string appears on the plot. See examples - Example 3. |
-
pval.size | -numeric value specifying the p-value text size. Default is 5. |
-
pval.coord | -numeric vector, of length 2, specifying the x and y -coordinates of the p-value. Default values are NULL. |
-
title, xlab, ylab | -main title and axis labels |
-
xlim, ylim | -x and y axis limits e.g. xlim = c(0, 1000), ylim = c(0, 1). |
-
axes.offset | -logical value. Default is TRUE. If FALSE, set the plot axes -to start at the origin. |
-
legend | -character specifying legend position. Allowed values are one of -c("top", "bottom", "left", "right", "none"). Default is "top" side position. -To remove the legend, use legend = "none". Legend position can also be -specified using a numeric vector c(x, y); see details section. |
-
legend.title | -legend title. |
-
legend.labs | -character vector specifying legend labels. Used to replace -the names of the strata from the fit. Should be given in the same order as -those strata. |
-
risk.table | -Allowed values include:
|
-
risk.table.title | -The title to be used for the risk table. |
-
risk.table.pos | -character vector specifying the risk table position. -Allowed options are one of c("out", "in") indicating 'outside' or 'inside' -the main plot, respectively. Default value is "out". |
-
risk.table.col | -same as tables.col but for risk table only. |
-
risk.table.fontsize, fontsize | -font size to be used for the risk table -and the cumulative events table. |
-
risk.table.y.text | -logical. Default is TRUE. If FALSE, risk table y axis -tick labels will be hidden. |
-
risk.table.y.text.col | -logical. Default value is FALSE. If TRUE, risk -table tick labels will be colored by strata. |
-
tables.height | -numeric value (in [0 - 1]) specifying the general height -of all tables under the main survival plot. |
-
tables.y.text | -logical. Default is TRUE. If FALSE, the y axis tick -labels of tables will be hidden. |
-
tables.y.text.col | -logical. Default value is FALSE. If TRUE, tables tick -labels will be colored by strata. |
-
tables.col | -color to be used for all tables under the main plot. Default -value is "black". If you want to color by strata (i.e. groups), use -tables.col = "strata". |
-
tables.theme | -function, ggplot2 theme name. Default value is
-theme_survminer. Allowed values include ggplot2 official themes: see
- |
-
risk.table.height | -the height of the risk table on the grid. Increase -the value when you have many strata. Default is 0.25. Ignored when -risk.table = FALSE. |
-
surv.plot.height | -the height of the survival plot on the grid. Default -is 0.75. Ignored when risk.table = FALSE. |
-
ncensor.plot | -logical value. If TRUE, the number of censored subjects at -time t is plotted. Default is FALSE. Ignored when cumcensor = TRUE. |
-
ncensor.plot.title | -The title to be used for the censor plot. Used when
- |
-
ncensor.plot.height | -The height of the censor plot. Used when
- |
-
cumevents | -logical value specifying whether to show or not the table of -the cumulative number of events. Default is FALSE. |
-
cumevents.title | -The title to be used for the cumulative events table. |
-
cumevents.col | -same as tables.col but for the cumulative events table -only. |
-
cumevents.y.text | -logical. Default is TRUE. If FALSE, the y axis tick -labels of the cumulative events table will be hidden. |
-
cumevents.y.text.col | -logical. Default value is FALSE. If TRUE, the y -tick labels of the cumulative events will be colored by strata. |
-
cumevents.height | -the height of the cumulative events table on the grid. -Default is 0.25. Ignored when cumevents = FALSE. |
-
cumcensor | -logical value specifying whether to show or not the table of -the cumulative number of censoring. Default is FALSE. |
-
cumcensor.title | -The title to be used for the cumcensor table. |
-
cumcensor.col | -same as tables.col but for cumcensor table only. |
-
cumcensor.y.text | -logical. Default is TRUE. If FALSE, the y axis tick -labels of the cumcensor table will be hidden. |
-
cumcensor.y.text.col | -logical. Default value is FALSE. If TRUE, the y -tick labels of the cumcensor will be colored by strata. |
-
cumcensor.height | -the height of the cumcensor table on the grid. Default -is 0.25. Ignored when cumcensor = FALSE. |
-
surv.median.line | -character vector for drawing a horizontal/vertical -line at median survival. Allowed values include one of c("none", "hv", "h", -"v"). v: vertical, h:horizontal. |
-
ggtheme | -function, ggplot2 theme name. Default value is
-theme_survminer. Allowed values include ggplot2 official themes: see
- |
-
... | -other arguments to be passed i) to ggplot2 geom_*() functions such -as linetype, size, ii) or to the function ggpar() for -customizing the plots. See details section. |
-
log.rank.weights | -The name for the type of weights to be used in
-computing the p-value for log-rank test. By default |
-
pval.method | -whether to add a text with the test name used for
-calculating the pvalue, that corresponds to survival curves' comparison -
-used only when |
-
pval.method.size | -the same as |
-
pval.method.coord | -the same as |
-
R/ggsurvplot_combine.R
- ggsurvplot_combine.Rd
Combine multiple survfit objects on the same plot. For example,
- one might wish to plot progression free survival and overall survival on
- the same graph (and also stratified by treatment assignment).
- ggsurvplot_combine()
provides an extension to the
- ggsurvplot()
function for doing that.
ggsurvplot_combine( - fit, - data, - risk.table = FALSE, - risk.table.pos = c("out", "in"), - cumevents = FALSE, - cumcensor = FALSE, - tables.col = "black", - tables.y.text = TRUE, - tables.y.text.col = TRUE, - ggtheme = theme_survminer(), - tables.theme = ggtheme, - keep.data = FALSE, - risk.table.y.text = tables.y.text, - ... -)- -
fit | -a named list of survfit objects. |
-
---|---|
data | -the data frame used to compute survival curves. |
-
risk.table | -Allowed values include:
|
-
risk.table.pos | -character vector specifying the risk table position. -Allowed options are one of c("out", "in") indicating 'outside' or 'inside' -the main plot, respectively. Default value is "out". |
-
cumevents | -logical value specifying whether to show or not the table of -the cumulative number of events. Default is FALSE. |
-
cumcensor | -logical value specifying whether to show or not the table of -the cumulative number of censoring. Default is FALSE. |
-
tables.col | -color to be used for all tables under the main plot. Default -value is "black". If you want to color by strata (i.e. groups), use -tables.col = "strata". |
-
tables.y.text | -logical. Default is TRUE. If FALSE, the y axis tick -labels of tables will be hidden. |
-
    tables.y.text.col | -logical. Default value is TRUE. If TRUE, tables tick -labels will be colored by strata. |
-
ggtheme | -function, ggplot2 theme name. Default value is
-theme_survminer. Allowed values include ggplot2 official themes: see
- |
-
tables.theme | -function, ggplot2 theme name. Default value is
-theme_survminer. Allowed values include ggplot2 official themes: see
- |
-
    keep.data | -logical value specifying whether the plot data frame should be kept in the result. -Setting this to FALSE (the default) can give much smaller results and hence save memory allocation time. |
-
risk.table.y.text | -logical. Default is TRUE. If FALSE, risk table y axis -tick labels will be hidden. |
-
... | -other arguments to pass to the |
-
-library(survival) -# Create a demo data set -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: - set.seed(123) - demo.data <- data.frame( - os.time = colon$time, - os.status = colon$status, - pfs.time = sample(colon$time), - pfs.status = colon$status, - sex = colon$sex, rx = colon$rx, adhere = colon$adhere - ) - -# Ex1: Combine null models -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: - # Fit - pfs <- survfit( Surv(pfs.time, pfs.status) ~ 1, data = demo.data) - os <- survfit( Surv(os.time, os.status) ~ 1, data = demo.data) - # Combine on the same plot - fit <- list(PFS = pfs, OS = os) - ggsurvplot_combine(fit, demo.data)#> Warning: `select_()` is deprecated as of dplyr 0.7.0. -#> Please use `select()` instead. -#> This warning is displayed once every 8 hours. -#> Call `lifecycle::last_warnings()` to see where this warning was generated.-# Combine survival curves stratified by treatment assignment rx -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -# Fit -pfs <- survfit( Surv(pfs.time, pfs.status) ~ rx, data = demo.data) -os <- survfit( Surv(os.time, os.status) ~ rx, data = demo.data) -# Combine on the same plot -fit <- list(PFS = pfs, OS = os) -ggsurvplot_combine(fit, demo.data)-
R/ggsurvplot_df.R
- ggsurvplot_df.Rd
    An extension to ggsurvplot() to plot survival curves from - any data frame containing the summary of survival curves as returned by the - surv_summary() function.
-Might be useful for a user who wants - to use ggsurvplot for visualizing survival curves computed by another - method than the standard survfit.formula function. In this - case, the user has just to provide the data frame containing the summary of - the survival analysis.
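As a rough illustration of the kind of data frame meant here, the sketch below builds one by hand from a survfit object using only the survival package. The column names follow the surv_summary() output shown in the examples of this page; the object name surv_df is just a placeholder, and the result is an approximation of that output, not a reimplementation of surv_summary().

```r
library(survival)

# Kaplan-Meier fit stratified by sex (lung data ships with survival)
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Assemble a summary data frame directly from the survfit components.
# 'surv_df' is a hypothetical name used only for this sketch.
surv_df <- data.frame(
  time     = fit$time,
  n.risk   = fit$n.risk,
  n.event  = fit$n.event,
  n.censor = fit$n.censor,
  surv     = fit$surv,
  std.err  = fit$std.err,
  upper    = fit$upper,
  lower    = fit$lower,
  # fit$strata holds the number of time points per stratum
  strata   = factor(rep(names(fit$strata), fit$strata),
                    levels = names(fit$strata))
)

head(surv_df)
```

With survminer installed, a data frame of this shape is what one would expect to pass as the fit argument of ggsurvplot_df().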
-ggsurvplot_df( - fit, - fun = NULL, - color = NULL, - palette = NULL, - linetype = 1, - break.x.by = NULL, - break.time.by = NULL, - break.y.by = NULL, - surv.scale = c("default", "percent"), - surv.geom = geom_step, - xscale = 1, - conf.int = FALSE, - conf.int.fill = "gray", - conf.int.style = "ribbon", - conf.int.alpha = 0.3, - censor = TRUE, - censor.shape = "+", - censor.size = 4.5, - title = NULL, - xlab = "Time", - ylab = "Survival probability", - xlim = NULL, - ylim = NULL, - axes.offset = TRUE, - legend = c("top", "bottom", "left", "right", "none"), - legend.title = "Strata", - legend.labs = NULL, - ggtheme = theme_survminer(), - ... -)- -
    fit | -a data frame as returned by surv_summary. Should contain at least -the following columns:
|
-
---|---|
fun | -an arbitrary function defining a transformation of the survival -curve. Often used transformations can be specified with a character -argument: "event" plots cumulative events (f(y) = 1-y), "cumhaz" plots the -cumulative hazard function (f(y) = -log(y)), and "pct" for survival -probability in percentage. |
-
color | -color to be used for the survival curves.
|
-
palette | -the color palette to be used. Allowed values include "hue" for -the default hue color scale; "grey" for grey color palettes; brewer palettes -e.g. "RdBu", "Blues", ...; or custom color palette e.g. c("blue", "red"); and scientific journal palettes from ggsci R package, e.g.: "npg", - "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and - "rickandmorty". -See details section for more information. Can be also a numeric vector of -length(groups); in this case a basic color palette is created using the -function palette. |
-
    linetype | -line types. Allowed values include i) "strata" for changing -linetypes by strata (i.e. groups); ii) a numeric vector (e.g., c(1, 2)) or a -character vector c("solid", "dashed"). |
-
break.x.by | -alias of break.time.by. Numeric value controlling x axis -breaks. Default value is NULL. |
-
break.time.by | -numeric value controlling time axis breaks. Default value -is NULL. |
-
break.y.by | -same as break.x.by but for y axis. |
-
surv.scale | -scale transformation of survival curves. Allowed values are -"default" or "percent". |
-
    surv.geom | -survival curve style. Whether the survival curve is drawn as a step -function (geom_step) or as a smooth line (geom_line). |
-
xscale | -numeric or character value specifying x-axis scale.
|
-
conf.int | -logical value. If TRUE, plots confidence interval. |
-
conf.int.fill | -fill color to be used for confidence interval. |
-
conf.int.style | -confidence interval style. Allowed values include -c("ribbon", "step"). |
-
conf.int.alpha | -numeric value specifying fill color transparency. Value -should be in [0, 1], where 0 is full transparency and 1 is no transparency. |
-
censor | -logical value. If TRUE, censors will be drawn. |
-
censor.shape | -character or numeric value specifying the point shape of -censors. Default value is "+" (3), a sensible choice is "|" (124). |
-
    censor.size | -numeric value specifying the point size of censors. -Default is 4.5. |
-
title | -main title and axis labels |
-
xlab | -main title and axis labels |
-
ylab | -main title and axis labels |
-
xlim | -x and y axis limits e.g. xlim = c(0, 1000), ylim = c(0, 1). |
-
ylim | -x and y axis limits e.g. xlim = c(0, 1000), ylim = c(0, 1). |
-
axes.offset | -logical value. Default is TRUE. If FALSE, set the plot axes -to start at the origin. |
-
legend | -character specifying legend position. Allowed values are one of -c("top", "bottom", "left", "right", "none"). Default is "top" side position. -to remove the legend use legend = "none". Legend position can be also -specified using a numeric vector c(x, y); see details section. |
-
legend.title | -legend title. |
-
legend.labs | -character vector specifying legend labels. Used to replace -the names of the strata from the fit. Should be given in the same order as -those strata. |
-
ggtheme | -function, ggplot2 theme name. Default value is
-theme_survminer. Allowed values include ggplot2 official themes: see
- |
-
... | -other arguments to be passed i) to ggplot2 geom_*() functions such -as linetype, size, ii) or to the function ggpar() for -customizing the plots. See details section. |
-
-library(survival) - -# Fit survival curves -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -fit1 <- survfit( Surv(time, status) ~ 1, data = colon) -fit2 <- survfit( Surv(time, status) ~ adhere, data = colon) - -# Summary -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -head(surv_summary(fit1, colon))#> time n.risk n.event n.censor surv std.err upper lower -#> 1 8 1858 1 0 0.9994618 0.0005383580 1.0000000 0.9984077 -#> 2 9 1857 1 0 0.9989236 0.0007615583 1.0000000 0.9974337 -#> 3 19 1856 1 0 0.9983854 0.0009329660 1.0000000 0.9965614 -#> 4 20 1855 1 0 0.9978471 0.0010775868 0.9999569 0.9957419 -#> 5 23 1854 1 1 0.9973089 0.0012051037 0.9996673 0.9949561 -#> 6 24 1852 1 1 0.9967704 0.0013206006 0.9993537 0.9941938#> time n.risk n.event n.censor surv std.err upper lower -#> 1 8 1588 1 0 0.9993703 0.0006299213 1.0000000 0.9981372 -#> 2 9 1587 1 0 0.9987406 0.0008911240 1.0000000 0.9969977 -#> 3 19 1586 1 0 0.9981108 0.0010917438 1.0000000 0.9959774 -#> 4 20 1585 1 0 0.9974811 0.0012610351 0.9999495 0.9950188 -#> 5 23 1584 1 1 0.9968514 0.0014103253 0.9996107 0.9940997 -#> 6 24 1582 1 1 0.9962213 0.0015455856 0.9992437 0.9932080 -#> strata adhere -#> 1 adhere=0 0 -#> 2 adhere=0 0 -#> 3 adhere=0 0 -#> 4 adhere=0 0 -#> 5 adhere=0 0 -#> 6 adhere=0 0-# Visualize -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -ggsurvplot_df(surv_summary(fit1, colon))-ggsurvplot_df(surv_summary(fit2, colon), conf.int = TRUE, - legend.title = "Adhere", legend.labs = c("0", "1"))-# Kaplan-Meier estimate -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -out_km <- survfit(Surv(time, status) ~ 1, data = lung) - -# Weibull model -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -wb <- survreg(Surv(time, status) ~ 1, data = lung) -s <- seq(.01, .99, by = .01) -t <- predict(wb, type = "quantile", p = s, newdata = lung[1, ]) -out_wb <- data.frame(time = t, surv = 1 - s, upper = NA, lower = NA, std.err = NA) - -# plot both 
-#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -p_km <- ggsurvplot(out_km, conf.int = FALSE) -p_wb <- ggsurvplot(out_wb, conf.int = FALSE, surv.geom = geom_line) - -p_km -p_wb -p_km$plot + geom_line(data = out_wb, aes(x = time, y = surv))-
Draw multi-panel survival curves of a data set grouped by one or - two variables.
-ggsurvplot_facet( - fit, - data, - facet.by, - color = NULL, - palette = NULL, - legend.labs = NULL, - pval = FALSE, - pval.method = FALSE, - pval.coord = NULL, - pval.method.coord = NULL, - nrow = NULL, - ncol = NULL, - scales = "fixed", - short.panel.labs = FALSE, - panel.labs = NULL, - panel.labs.background = list(color = NULL, fill = NULL), - panel.labs.font = list(face = NULL, color = NULL, size = NULL, angle = NULL), - panel.labs.font.x = panel.labs.font, - panel.labs.font.y = panel.labs.font, - ... -)- -
fit | -an object of class survfit. |
-
---|---|
data | -a dataset used to fit survival curves. If not supplied then data -will be extracted from 'fit' object. |
-
facet.by | -character vector, of length 1 or 2, specifying grouping -variables for faceting the plot. Should be in the data. |
-
color | -color to be used for the survival curves.
|
-
palette | -the color palette to be used. Allowed values include "hue" for -the default hue color scale; "grey" for grey color palettes; brewer palettes -e.g. "RdBu", "Blues", ...; or custom color palette e.g. c("blue", "red"); and scientific journal palettes from ggsci R package, e.g.: "npg", - "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and - "rickandmorty". -See details section for more information. Can be also a numeric vector of -length(groups); in this case a basic color palette is created using the -function palette. |
-
legend.labs | -character vector specifying legend labels. Used to replace -the names of the strata from the fit. Should be given in the same order as -those strata. |
-
    pval | -logical value, a numeric or a string. If logical and TRUE, the -p-value is added on the plot. If numeric, then the computed p-value is -replaced by the one passed with this parameter. If character, then the -customized string appears on the plot. See examples - Example 3. |
-
pval.method | -whether to add a text with the test name used for
-calculating the pvalue, that corresponds to survival curves' comparison -
-used only when |
-
pval.coord | -numeric vector, of length 2, specifying the x and y -coordinates of the p-value. Default values are NULL. |
-
pval.method.coord | -the same as |
-
    nrow, ncol | -Number of rows and columns in the panel. Used only when the -data is faceted by one grouping variable. |
-
scales | -should axis scales of panels be fixed ("fixed", the default), -free ("free"), or free in one dimension ("free_x", "free_y"). |
-
short.panel.labs | -logical value. Default is FALSE. If TRUE, create short -labels for panels by omitting variable names; in other words panels will be -labelled only by variable grouping levels. |
-
panel.labs | -a list of one or two character vectors to modify facet label -text. For example, panel.labs = list(sex = c("Male", "Female")) specifies -the labels for the "sex" variable. For two grouping variables, you can use -for example panel.labs = list(sex = c("Male", "Female"), rx = c("Obs", -"Lev", "Lev2") ). |
-
panel.labs.background | -a list to customize the background of panel -labels. Should contain the combination of the following elements:
For example, -panel.labs.background = list(color = "blue", fill = "pink"). |
-
    panel.labs.font | -a list of aesthetics indicating the size (e.g.: 14), the -face/style (e.g.: "plain", "bold", "italic", "bold.italic"), the color -(e.g.: "red") and the orientation angle (e.g.: 45) of panel labels. |
-
panel.labs.font.x, panel.labs.font.y | -same as panel.labs.font but for x -and y direction, respectively. |
-
... | -other arguments to pass to the function |
-
-library(survival) - -# Facet by one grouping variables: rx -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -fit <- survfit( Surv(time, status) ~ sex, data = colon ) -ggsurvplot_facet(fit, colon, facet.by = "rx", - palette = "jco", pval = TRUE)#> Warning: `as.tibble()` is deprecated as of tibble 2.0.0. -#> Please use `as_tibble()` instead. -#> The signature and semantics have changed, see `?as_tibble`. -#> This warning is displayed once every 8 hours. -#> Call `lifecycle::last_warnings()` to see where this warning was generated.-# Facet by two grouping variables: rx and adhere -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -ggsurvplot_facet(fit, colon, facet.by = c("rx", "adhere"), - palette = "jco", pval = TRUE)- -# Another fit -#:::::::::::::::::::::::::::::::::::::::::::::::::::::::: -fit2 <- survfit( Surv(time, status) ~ sex + rx, data = colon ) -ggsurvplot_facet(fit2, colon, facet.by = "adhere", - palette = "jco", pval = TRUE)-
Survival curves of grouped data sets by one or two - variables.
-Survival analyses are often performed on subsets defined by - variables in the dataset. For example, assume that we have a cohort of - patients with a large number of clinicopathological and molecular - covariates, including survival data, TP53 mutation status and the patients' - sex (Male or Female).
-One might also be interested in comparing the - survival curves of males and females after grouping (or splitting) the data - by TP53 mutation status.
-ggsurvplot_group_by
() provides a
 - convenient way to create multiple ggsurvplots of a data set
- grouped by one or two variables.
ggsurvplot_group_by(fit, data, group.by, ...)- -
fit | -a survfit object. |
-
---|---|
data | -a data frame used to fit survival curves. |
-
group.by | -a character vector containing the name of grouping variables. Should be of length <= 2. |
-
... | -... other arguments passed to the core function
- |
-
    Returns a list of ggsurvplots.
-ggsurvplot_group_by
() works as follows:
Create grouped data sets using the function surv_group_by()
, --> list of data sets
Map surv_fit()
to each nested data --> Returns a list of survfit objects
Map ggsurvplot()
to each survfit object --> Returns a list of ggsurvplots
One can (optionally) arrange the list of ggsurvplots using arrange_ggsurvplots()
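
The first two steps above can be sketched with base R and the survival package alone. This is only an analogy under stated assumptions: split() and lapply() stand in for surv_group_by() and surv_fit(), and the object names groups and fits are placeholders, so the code shows the idea rather than what survminer does internally.

```r
library(survival)

# Step 1: split the data by the grouping variable (stand-in for surv_group_by())
groups <- split(colon, colon$rx)

# Step 2: fit the same model on every subset (stand-in for mapping surv_fit())
fits <- lapply(groups, function(g) {
  survfit(Surv(time, status) ~ sex, data = g)
})

# One survfit object per treatment group
names(fits)

# Step 3 would map ggsurvplot() over 'fits' together with the matching
# data subsets, which needs survminer and is therefore not run here.
```

Keeping each fit next to the subset it was computed from matters because the plotting step needs both the survfit object and its data.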
-# Fit survival curves -#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -library(survival) -fit <- survfit( Surv(time, status) ~ sex, data = colon ) - -# Visualize: grouped by treatment rx -#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -ggsurv.list <- ggsurvplot_group_by(fit, colon, group.by = "rx", risk.table = TRUE, - pval = TRUE, conf.int = TRUE, palette = "jco") -names(ggsurv.list)#> [1] "rx.Obs::sex" "rx.Lev::sex" "rx.Lev+5FU::sex"- -# Visualize: grouped by treatment rx and adhere -#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -ggsurv.list <- ggsurvplot_group_by(fit, colon, group.by = c("rx", "adhere"), - risk.table = TRUE, - pval = TRUE, conf.int = TRUE, palette = "jco") - -names(ggsurv.list)#> [1] "rx:Obs, adhere:0::sex" "rx:Obs, adhere:1::sex" -#> [3] "rx:Lev, adhere:0::sex" "rx:Lev, adhere:1::sex" -#> [5] "rx:Lev+5FU, adhere:0::sex" "rx:Lev+5FU, adhere:1::sex"
Take a list of survfit objects and produce a list of
- ggsurvplots
.
ggsurvplot_list( - fit, - data, - title = NULL, - legend.labs = NULL, - legend.title = "Strata", - ... -)- -
fit | -a list of survfit objects. |
-
---|---|
data | -data used to fit survival curves. Can be also a list of same
-length than |
-
title | -title of the plot. Can be a character vector or a list of titles
-of same length than |
-
legend.labs | -character vector specifying legend labels. Used to replace
-the names of the strata from the fit. Should be given in the same order as
-those strata. Can be a list when |
-
    legend.title | -legend title for each plot. Can be a character vector or a -list of titles of the same length as fit. |
-
... | -other arguments passed to the core function
- |
-
Returns a list of ggsurvplots.
---library(survival) - -# Create a list of formulas -#::::::::::::::::::::::::::::::::::::::::::::::::::::::: -data(colon) -f1 <- survfit(Surv(time, status) ~ adhere, data = colon) -f2 <- survfit(Surv(time, status) ~ rx, data = colon) -fits <- list(sex = f1, rx = f2) - -# Visualize -#::::::::::::::::::::::::::::::::::::::::::::::::::::::: -legend.title <- list("sex", "rx") -ggsurvplot_list(fits, colon, legend.title = legend.title)#> $sex#> -#> $rx#> -#> attr(,"class") -#> [1] "list" "ggsurvplot_list"-
Plot survival tables:
ggrisktable()
: Plot the number at risk table.
ggcumevents()
: Plot the cumulative number of events table.
ggcumcensor()
: Plot the cumulative number of censored subjects, i.e. the number of subjects who
 - exit the risk set without an event by time t. Normally, users don't need
 - to use this function directly.
ggsurvtable()
: Generic function to plot any survival tables.
Normally, users don't need to use this function directly. Internally used by the function
- ggsurvplot
.
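
To make concrete what these tables contain, the following sketch computes the underlying numbers with the survival package only. It assumes that summary() on a survfit object, called with a times argument, reports events and censorings per interval between the requested times; the name risk_tab and the chosen time grid are placeholders for illustration.

```r
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)
times <- seq(0, 1000, by = 250)

# extend = TRUE keeps a row for every requested time in every stratum
s <- summary(fit, times = times, extend = TRUE)

# 'risk_tab' is a hypothetical name; counts per interval are accumulated
# within each stratum to obtain cumulative events and censorings
risk_tab <- data.frame(
  strata     = s$strata,
  time       = s$time,
  n.risk     = s$n.risk,
  cum.event  = ave(s$n.event,  s$strata, FUN = cumsum),
  cum.censor = ave(s$n.censor, s$strata, FUN = cumsum)
)

risk_tab
```

The n.risk column corresponds to what a number-at-risk table displays, while cum.event and cum.censor correspond to the cumulative events and cumulative censoring tables.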
ggrisktable( - fit, - data = NULL, - risk.table.type = c("absolute", "percentage", "abs_pct", "nrisk_cumcensor", - "nrisk_cumevents"), - ... -) - -ggcumevents(fit, data = NULL, ...) - -ggcumcensor(fit, data = NULL, ...) - -ggsurvtable( - fit, - data = NULL, - survtable = c("cumevents", "cumcensor", "risk.table"), - risk.table.type = c("absolute", "percentage", "abs_pct", "nrisk_cumcensor", - "nrisk_cumevents"), - title = NULL, - risk.table.title = NULL, - cumevents.title = title, - cumcensor.title = title, - color = "black", - palette = NULL, - break.time.by = NULL, - xlim = NULL, - xscale = 1, - xlab = "Time", - ylab = "Strata", - xlog = FALSE, - legend = "top", - legend.title = "Strata", - legend.labs = NULL, - y.text = TRUE, - y.text.col = TRUE, - fontsize = 4.5, - font.family = "", - axes.offset = TRUE, - ggtheme = theme_survminer(), - tables.theme = ggtheme, - ... -)- -
fit | -an object of class survfit. Can be a list containing two -components: 1) time: time variable used in survfit; 2) table: survival table -as generated by the internal function .get_timepoints_survsummary(). Can be -also a simple data frame. |
-
---|---|
data | -a dataset used to fit survival curves. If not supplied then data -will be extracted from 'fit' object. |
-
risk.table.type | -risk table type. Allowed values include: "absolute" or -"percentage": to show the absolute number and the percentage -of subjects at risk by time, respectively. Use "abs_pct" to show both -absolute number and percentage. Used only when survtable = "risk.table". |
-
... | -other arguments passed to the function |
-
survtable | -a character string specifying the type of survival table to plot. |
-
title | -the title of the plot. |
-
risk.table.title | -The title to be used for the risk table. |
-
cumevents.title | -The title to be used for the cumulative events table. |
-
cumcensor.title | -The title to be used for the cumcensor table. |
-
color | -color to be used for the survival curves.
|
-
palette | -the color palette to be used. Allowed values include "hue" for -the default hue color scale; "grey" for grey color palettes; brewer palettes -e.g. "RdBu", "Blues", ...; or custom color palette e.g. c("blue", "red"); and scientific journal palettes from ggsci R package, e.g.: "npg", - "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and - "rickandmorty". -See details section for more information. Can be also a numeric vector of -length(groups); in this case a basic color palette is created using the -function palette. |
-
break.time.by | -numeric value controlling time axis breaks. Default value -is NULL. |
-
xlim | -x and y axis limits e.g. xlim = c(0, 1000), ylim = c(0, 1). |
-
xscale | -numeric or character value specifying x-axis scale.
|
-
xlab | -main title and axis labels |
-
ylab | -main title and axis labels |
-
    xlog | -logical value. If TRUE, x axis is transformed into log scale. |
-
legend | -character specifying legend position. Allowed values are one of -c("top", "bottom", "left", "right", "none"). Default is "top" side position. -to remove the legend use legend = "none". Legend position can be also -specified using a numeric vector c(x, y); see details section. |
-
legend.title | -legend title. |
-
legend.labs | -character vector specifying legend labels. Used to replace -the names of the strata from the fit. Should be given in the same order as -those strata. |
-
y.text | -logical. Default is TRUE. If FALSE, the table y axis tick -labels will be hidden. |
-
    y.text.col | -logical. Default value is TRUE. If TRUE, the table tick -labels will be colored by strata. |
-
fontsize | -text font size. |
-
font.family | -character vector specifying text element font family, e.g.: font.family = "Courier New". |
-
axes.offset | -logical value. Default is TRUE. If FALSE, set the plot axes -to start at the origin. |
-
ggtheme | -function, ggplot2 theme name. Default value is
-theme_survminer. Allowed values include ggplot2 official themes: see
- |
-
tables.theme | -function, ggplot2 theme name. Default value is
-theme_survminer. Allowed values include ggplot2 official themes: see
- |
-
a ggplot.
-ggrisktable
: Plot the number at risk table.
ggcumevents
: Plot the cumulative number of events table
ggcumcensor
: Plot the cumulative number of censored subjects table
ggsurvtable
: Generic function to plot survival tables: risk.table, cumevents and cumcensor
-# Fit survival curves -#::::::::::::::::::::::::::::::::::::::::::::::: -require("survival") -fit<- survfit(Surv(time, status) ~ sex, data = lung) - -# Survival tables -#::::::::::::::::::::::::::::::::::::::::::::::: -tables <- ggsurvtable(fit, data = lung, color = "strata", - y.text = FALSE) - -# Risk table -tables$risk.table-# Number of cumulative events -tables$cumevents-# Number of cumulative censoring -tables$cumcensor
Default theme for plots generated with survminer.
-theme_survminer( - base_size = 12, - base_family = "", - font.main = c(16, "plain", "black"), - font.submain = c(15, "plain", "black"), - font.x = c(14, "plain", "black"), - font.y = c(14, "plain", "black"), - font.caption = c(15, "plain", "black"), - font.tickslab = c(12, "plain", "black"), - legend = c("top", "bottom", "left", "right", "none"), - font.legend = c(10, "plain", "black"), - ... -) - -theme_cleantable(base_size = 12, base_family = "", ...)- -
base_size | -base font size |
-
---|---|
base_family | -base font family |
-
    font.main, font.submain, font.caption, font.x, font.y, font.tickslab, font.legend | -a vector of length 3 -indicating the size (e.g.: 14), the style (e.g.: "plain", -"bold", "italic", "bold.italic") and the color (e.g.: "red") of the main title, subtitle, caption, -x and y axis labels, axis tick labels and legend, respectively. For example font.x = -c(14, "bold", "red"). Use font.x = 14 to change only the font size, or -font.x = "bold" to change only the font face. |
-
legend | -character specifying legend position. Allowed values are one of -c("top", "bottom", "left", "right", "none"). Default is "top" side position. -to remove the legend use legend = "none". Legend position can be also -specified using a numeric vector c(x, y); see details section. |
-
... | -additional arguments passed to the function theme_survminer(). |
-
theme_survminer
: Default theme for survminer plots. A theme similar to theme_classic() with large font size.
theme_cleantable
: theme for drawing a clean risk table and cumulative
-number of events table. A theme similar to theme_survminer() without i)
-axis lines and, ii) x axis ticks and title.
--# Fit survival curves -#++++++++++++++++++++++++++++++++++++ -require("survival") -fit<- survfit(Surv(time, status) ~ sex, data = lung) - -# Basic survival curves -#++++++++++++++++++++++++++++++++++++ -ggsurv <- ggsurvplot(fit, data = lung, risk.table = TRUE, - main = "Survival curves", - submain = "Based on Kaplan-Meier estimates", - caption = "created with survminer" - ) - -# Change font size, style and color -#++++++++++++++++++++++++++++++++++++ -# Change font size, style and color at the same time -# Use font.x = 14, to change only font size; or use -# font.x = "bold", to change only font face. -ggsurv %+% theme_survminer( - font.main = c(16, "bold", "darkblue"), - font.submain = c(15, "bold.italic", "purple"), - font.caption = c(14, "plain", "orange"), - font.x = c(14, "bold.italic", "red"), - font.y = c(14, "bold.italic", "darkred"), - font.tickslab = c(12, "plain", "darkgreen") - ) - -# Clean risk table -# +++++++++++++++++++++++++++++ -ggsurv$table <- ggsurv$table + theme_cleantable() -ggsurv-
- Fit survival curves- - |
- |
---|---|
- Survival Curves-Summarize and visualize survival curves. - |
- |
- - | -Drawing Survival Curves Using ggplot2 |
-
- - | -Ggplots of Fitted Flexible Survival Models |
-
- - | -Arranging Multiple ggsurvplots |
-
- - | -Distribution of Events' Times |
-
- - | -Nice Summary of a Survival Curve |
-
-
|
- Determine the Optimal Cutpoint for Continuous Variables |
-
- - | -Multiple Comparisons of Survival Curves |
-
- Diagnostics of Cox Model- - |
- |
- - | -Diagnostic Plots for Cox Proportional Hazards Model with ggplot2 |
-
- - | -Functional Form of Continuous Variable in Cox Proportional Hazards Model |
-
- - | -Graphical Test of Proportional Hazards with ggplot2 |
-
- Summary of Cox Model- - |
- |
- - | -Forest Plot for Cox Proportional Hazards Model |
-
- - | -Adjusted Survival Curves for Cox Proportional Hazards Model |
-
- Competing Risks- - |
- |
- - | -Cumulative Incidence Curves for Competing Risks |
-
- Helpers- - |
- |
- - | -Plot Survival Tables |
-
- - | -Plot Survival Curves from Survival Summary Data Frame |
-
- - | -Plot a List of Survfit Objects |
-
- - | -Survival Curves of Grouped Data sets |
-
- - | -Add Survival Curves of Pooled Patients onto the Main Plot |
-
- - | -Combine a List of Survfit Objects on the Same Plot |
-
- - | -Facet Survival Curves into Multiple Panels |
-
- Data- - |
- |
- - | -Multiple Myeloma Data |
-
- - | -Bone Marrow Transplant |
-
- Others- - |
- |
- - | -Theme for Survminer Plots |
-
- - | -Add Components to a ggsurvplot |
-
Multiple Myeloma data extracted from publicly available gene - expression data (GEO Id: GSE4581).
-data("myeloma")- - -
A data frame with 256 rows and 12 columns.
molecular_group
Patients' molecular subgroups
chr1q21_status
Amplification status of the chromosome -1q21
treatment
treatment
event
survival status 0 = -alive, 1 = dead
time
Survival time in months
CCND1
Gene expression
CRIM1
Gene expression
DEPDC1
Gene expression
IRF4
Gene expression
TP53
Gene expression
WHSC1
Gene expression
The remaining columns (CCND1, CRIM1, DEPDC1, IRF4, TP53, WHSC1) correspond to -the gene expression level of specified genes.
- --#> molecular_group chr1q21_status treatment event time CCND1 CRIM1 -#> GSM50986 Cyclin D-1 3 copies TT2 0 69.24 9908.4 420.9 -#> GSM50988 Cyclin D-2 2 copies TT2 0 66.43 16698.8 52.0 -#> GSM50989 MMSET 2 copies TT2 0 66.50 294.5 617.9 -#> GSM50990 MMSET 3 copies TT2 1 42.67 241.9 11.9 -#> GSM50991 MAF <NA> TT2 0 65.00 472.6 38.8 -#> GSM50992 Hyperdiploid 2 copies TT2 0 65.20 664.1 16.9 -#> DEPDC1 IRF4 TP53 WHSC1 -#> GSM50986 523.5 16156.5 10.0 261.9 -#> GSM50988 21.1 16946.2 1056.9 363.8 -#> GSM50989 192.9 8903.9 1762.8 10042.9 -#> GSM50990 184.7 11894.7 946.8 4931.0 -#> GSM50991 212.0 7563.1 361.4 165.0 -#> GSM50992 341.6 16023.4 2096.3 569.2-
Calculate pairwise comparisons between group levels with - corrections for multiple testing.
-pairwise_survdiff(formula, data, p.adjust.method = "BH", na.action, rho = 0)- -
formula | -a formula expression as for other survival models, of the form -Surv(time, status) ~ predictors. |
-
---|---|
data | -a data frame in which to interpret the variables occurring in the -formula. |
-
p.adjust.method | -method for adjusting p values (see
- |
-
na.action | -a missing-data filter function. Default is -options()$na.action. |
-
    rho | -a scalar parameter that controls the type of test. Allowed values -include 0 (for the log-rank test) and 1 (for the Peto & Peto modification of the Gehan-Wilcoxon test). |
-
Returns an object of class "pairwise.htest", which is a list - containing the p values.
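
The idea behind the returned p-values can be sketched with survival::survdiff() and stats::p.adjust(). The code below is an assumption-laden illustration, not the source of pairwise_survdiff(): the helper logrank_p is hypothetical, and the colon data restricted to death records is used only because it ships with the survival package.

```r
library(survival)

# One row per patient: keep the death records of the colon data
d <- subset(colon, etype == 2)
lev <- levels(d$rx)

# Hypothetical helper: log-rank p-value using only groups a and b
logrank_p <- function(a, b) {
  sub <- droplevels(d[d$rx %in% c(a, b), ])
  fit <- survdiff(Surv(time, status) ~ rx, data = sub, rho = 0)
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}

# All pairs of group levels, one test per pair
pairs <- t(combn(lev, 2))
raw_p <- mapply(logrank_p, pairs[, 1], pairs[, 2])

# Correct for multiple testing (Benjamini-Hochberg, as in the default above)
adj_p <- p.adjust(raw_p, method = "BH")

data.frame(group1 = pairs[, 1], group2 = pairs[, 2],
           raw_p, adj_p, row.names = NULL)
```

Each test uses only the two groups being compared, and the adjustment is applied across the whole set of pairwise p-values.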
-survival::survdiff
--library(survival) -library(survminer) -data(myeloma) - -# Pairwise survdiff -res <- pairwise_survdiff(Surv(time, event) ~ molecular_group, - data = myeloma) -res#> -#> Pairwise comparisons using Log-Rank test -#> -#> data: myeloma and molecular_group -#> -#> Cyclin D-1 Cyclin D-2 Hyperdiploid Low bone disease MAF -#> Cyclin D-2 0.723 - - - - -#> Hyperdiploid 0.943 0.723 - - - -#> Low bone disease 0.723 0.988 0.644 - - -#> MAF 0.644 0.447 0.523 0.485 - -#> MMSET 0.328 0.103 0.103 0.103 0.723 -#> Proliferation 0.103 0.038 0.038 0.062 0.485 -#> MMSET -#> Cyclin D-2 - -#> Hyperdiploid - -#> Low bone disease - -#> MAF - -#> MMSET - -#> Proliferation 0.527 -#> -#> P value adjustment method: BH-# Symbolic number coding -symnum(res$p.value, cutpoints = c(0, 0.0001, 0.001, 0.01, 0.05, 0.1, 1), - symbols = c("****", "***", "**", "*", "+", " "), - abbr.colnames = FALSE, na = "")#> Cyclin D-1 Cyclin D-2 Hyperdiploid Low bone disease MAF MMSET -#> Cyclin D-2 -#> Hyperdiploid -#> Low bone disease -#> MAF -#> MMSET -#> Proliferation * * + -#> attr(,"legend") -#> [1] 0 ‘****’ 1e-04 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘+’ 0.1 ‘ ’ 1 \t ## NA: ‘’- -
R/surv_cutpoint.R
- surv_cutpoint.Rd
Determine the optimal cutpoint for one or multiple continuous - variables at once, using the maximally selected rank statistics from the - 'maxstat' R package. This is an outcome-oriented method that provides the - cutpoint value corresponding to the most significant relation with - outcome (here, survival).
surv_cutpoint(): Determine the optimal cutpoint for each variable using 'maxstat'.
surv_categorize(): Divide the values of each variable into categories based on the cutpoint returned by surv_cutpoint().
surv_cutpoint( - data, - time = "time", - event = "event", - variables, - minprop = 0.1, - progressbar = TRUE -) - -surv_categorize(x, variables = NULL, labels = c("low", "high")) - -# S3 method for surv_cutpoint -summary(object, ...) - -# S3 method for surv_cutpoint -print(x, ...) - -# S3 method for surv_cutpoint -plot(x, variables = NULL, ggtheme = theme_classic(), bins = 30, ...) - -# S3 method for plot_surv_cutpoint -print(x, ..., newpage = TRUE)- -
data | -a data frame containing survival information (time, event) and -continuous variables (e.g.: gene expression data). |
-
---|---|
time, event | -column names containing time and event data, respectively. -Event values should be 0 or 1. |
-
variables | -a character vector containing the names of variables of -interest, for which we want to estimate the optimal cutpoint. |
-
minprop | -the minimal proportion of observations per group. |
-
progressbar | -logical value. If TRUE, show progress bar. The progress bar is -shown only when the number of variables is > 5. |
-
x, object | -an object of class surv_cutpoint |
-
labels | -labels for the levels of the resulting category. |
-
... | -other arguments. For plots, see ?ggpubr::ggpar |
-
ggtheme | -function, ggplot2 theme name. Default value is -theme_classic. Allowed values include ggplot2 official themes. See -?ggplot2::ggtheme. |
-
bins | -Number of bins for histogram. Defaults to 30. |
-
newpage | -open a new page. See |
-
surv_cutpoint(): returns an object of class 'surv_cutpoint', - which is a list with the following components:
maxstat - results for each variable (see ?maxstat::maxstat)
cutpoint: a data - frame containing the optimal cutpoint of each variable. Rows are variable - names and columns are c("cutpoint", "statistic").
data: a data frame - containing the survival data and the original data for the specified - variables.
minprop: the minimal proportion of observations per group.
not_numeric: contains data for non-numeric variables, in the context - where the user provided categorical variable names in the argument - variables.
surv_categorize(): returns an object of class - 'surv_categorize', which is a data frame containing the survival data and - the categorized variables.
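The categorization step can be pictured with plain base R. The sketch below takes the six DEPDC1 values printed by head(myeloma) in the examples and the DEPDC1 cutpoint reported by summary(res.cut) (279.8), and labels each value "low" or "high". The strict ">" comparison is an assumption made for illustration; the exact rule used by surv_categorize() is not stated here.

```r
# DEPDC1 expression for the first six samples shown in head(myeloma)
depdc1 <- c(GSM50986 = 523.5, GSM50988 = 21.1, GSM50989 = 192.9,
            GSM50990 = 184.7, GSM50991 = 212.0, GSM50992 = 341.6)

# Cutpoint reported for DEPDC1 by summary(res.cut)
cutpoint <- 279.8

# Assumed rule: values above the cutpoint are "high", the rest are "low"
category <- ifelse(depdc1 > cutpoint, "high", "low")
category
```

With these inputs the labels agree with the DEPDC1 column of the categorized data shown in the examples.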
-#> molecular_group chr1q21_status treatment event time CCND1 CRIM1 -#> GSM50986 Cyclin D-1 3 copies TT2 0 69.24 9908.4 420.9 -#> GSM50988 Cyclin D-2 2 copies TT2 0 66.43 16698.8 52.0 -#> GSM50989 MMSET 2 copies TT2 0 66.50 294.5 617.9 -#> GSM50990 MMSET 3 copies TT2 1 42.67 241.9 11.9 -#> GSM50991 MAF <NA> TT2 0 65.00 472.6 38.8 -#> GSM50992 Hyperdiploid 2 copies TT2 0 65.20 664.1 16.9 -#> DEPDC1 IRF4 TP53 WHSC1 -#> GSM50986 523.5 16156.5 10.0 261.9 -#> GSM50988 21.1 16946.2 1056.9 363.8 -#> GSM50989 192.9 8903.9 1762.8 10042.9 -#> GSM50990 184.7 11894.7 946.8 4931.0 -#> GSM50991 212.0 7563.1 361.4 165.0 -#> GSM50992 341.6 16023.4 2096.3 569.2-# 1. Determine the optimal cutpoint of variables -res.cut <- surv_cutpoint(myeloma, time = "time", event = "event", - variables = c("DEPDC1", "WHSC1", "CRIM1")) - -summary(res.cut)#> cutpoint statistic -#> DEPDC1 279.8 4.275452 -#> WHSC1 3205.6 3.361330 -#> CRIM1 82.3 1.968317-# 2. Plot cutpoint for DEPDC1 -# palette = "npg" (nature publishing group), see ?ggpubr::ggpar -plot(res.cut, "DEPDC1", palette = "npg")#> $DEPDC1#>#> time event DEPDC1 WHSC1 CRIM1 -#> GSM50986 69.24 0 high low high -#> GSM50988 66.43 0 low low low -#> GSM50989 66.50 0 low high high -#> GSM50990 42.67 1 low high low -#> GSM50991 65.00 0 low low low -#> GSM50992 65.20 0 high low low-# 4. Fit survival curves and visualize -library("survival") -fit <- survfit(Surv(time, event) ~DEPDC1, data = res.cat) -ggsurvplot(fit, data = res.cat, risk.table = TRUE, conf.int = TRUE)-
Wrapper around the standard survfit() function to create - survival curves. Compared to the standard survfit() function, it also supports:
a list of data sets and/or a list of formulas,
grouped data sets as generated by the function surv_group_by,
group.by option
There are many cases where this function might be useful:
Case 1: One formula and One data set. - Example: You want to fit the survival curves of one biomarker/gene in a given data set. - This is the same as the standard survfit() function. Returns one survfit object.
Case 2: List of formulas and One data set. - Example: You want to fit the survival curves of a list of biomarkers/genes in the same data set. - Returns a named list of survfit objects in the same order as formulas.
Case 3: One formula and List of data sets. - Example: You want to fit survival curves of one biomarker/gene in multiple cohorts of patients (colon, lung, breast). - Returns a named list of survfit objects in the same order as the data sets.
Case 4: List of formulas and List of data sets. - Example: You want to fit survival curves of multiple biomarkers/genes in multiple cohorts of patients (colon, lung, breast). - Each formula will be applied to each data set in the data list. - Returns a named list of survfit objects.
Case 5: One formula and grouped data sets by one or two variables.
- Example: One might like to plot the survival curves of patients
- treated by drug A vs patients treated by drug B in a dataset grouped by TP53 and/or RAS mutations.
- In this case use the argument group.by
. Returns a named list of survfit objects.
Case 6. In a rare case you might have a list of formulas and a list of data sets, and - you might want to apply each formula to the matching data set with the same index/position in the list. - For example formula1 is applied to data 1, formula2 is applied to data 2, and so on ... - In this case formula and data lists should have the same length and you should specify the argument match.fd = TRUE (stands for match formula and data). - Returns a named list of survfit objects.
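The difference between Case 4 (every formula applied to every data set) and Case 6 (formula i applied to data set i) can be sketched with the standard survfit() function alone. The two data sets below (death and recurrence records of the colon data) and the "data::formula" naming are choices made for this illustration only; they are not taken from surv_fit() itself.

```r
library(survival)

formulas <- list(sex = Surv(time, status) ~ sex,
                 rx  = Surv(time, status) ~ rx)

# Two illustrative data sets: death (etype 2) and recurrence (etype 1) records
datasets <- list(death = subset(colon, etype == 2),
                 recur = subset(colon, etype == 1))

# Case 4 analogue: each formula is applied to each data set (2 x 2 = 4 fits)
crossed <- list()
for (f in names(formulas)) {
  for (d in names(datasets)) {
    crossed[[paste(d, f, sep = "::")]] <-
      survfit(formulas[[f]], data = datasets[[d]])
  }
}

# Case 6 analogue: formulas and data sets are paired by position (2 fits)
matched <- Map(function(f, d) survfit(f, data = d), formulas, datasets)

names(crossed)
length(matched)
```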
The output of the surv_fit() function can be directly handled by the following functions:
These functions return one element or a list of elements depending on the format of the input.
-surv_fit(formula, data, group.by = NULL, match.fd = FALSE, ...)- -
formula | -survival formula. See survfit.formula. Can be a list of formulas. Named lists are recommended. |
-
---|---|
data | -a data frame in which to interpret the variables named in the formula. -Can be a list of data sets. Named lists are recommended. -Can also be a grouped dataset as generated by the function surv_group_by(). |
-
group.by | -grouping variables to group the data set by. -A character vector containing the names of the grouping variables. Should be of length <= 2. |
-
match.fd | -logical value. Default is FALSE. Stands for "match formula and data". -Useful only when you have a list of formulas and a list of data sets, and - you want to apply each formula to the matching data set with the same index/position in the list. - For example formula1 is applied to data 1, formula2 is applied to data 2, and so on .... - In this case use match.fd = TRUE. |
-
... | -Other arguments passed to the survfit.formula function. |
-
Returns an object of class survfit if one formula and one data set are provided.
Returns a named list of survfit objects when input is a list of formulas and/or data sets.
-The same holds true when grouped data sets are provided or when the argument group.by is specified.
If the names of formula and data lists are available, -the names of the resulting survfit objects list are obtained by collapsing the names of formula and data lists.
If the formula names are not available, the variables in the formulas are extracted and used to build the name of survfit object.
In the case of grouped data sets, the names of survfit object list are obtained by -collapsing the levels of grouping variables and the names of variables in the survival curve formulas.
--library("survival") -library("magrittr") - -# Case 1: One formula and One data set -#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -fit <- surv_fit(Surv(time, status) ~ sex, - data = colon) -surv_pvalue(fit)#> variable pval method pval.txt -#> 1 sex 0.6107936 Log-rank p = 0.61- -# Case 2: List of formulas and One data set. -# - Different formulas are applied to the same data set -# - Returns a (named) list of survfit objects -#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -# Create a named list of formulas -formulas <- list( - sex = Surv(time, status) ~ sex, - rx = Surv(time, status) ~ rx -) - -# Fit survival curves for each formula -fit <- surv_fit(formulas, data = colon) -surv_pvalue(fit)#> $`colon::sex` -#> variable pval method pval.txt -#> 1 sex 0.6107936 Log-rank p = 0.61 -#> -#> $`colon::rx` -#> variable pval method pval.txt -#> 1 rx 4.990735e-08 Log-rank p < 0.0001 -#>-# Case 3: One formula and List of data sets -#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -fit <- surv_fit(Surv(time, status) ~ sex, - data = list(colon, lung)) -surv_pvalue(fit)#> $`colon::sex` -#> variable pval method pval.txt -#> 1 sex 0.6107936 Log-rank p = 0.61 -#> -#> $`lung::sex` -#> variable pval method pval.txt -#> 1 sex 0.001311165 Log-rank p = 0.0013 -#>- -# Case 4: List of formulas and List of data sets -# - Each formula is applied to each of the data in the data list -# - argument: match.fd = FALSE -#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: - -# Create two data sets -set.seed(123) -colon1 <- dplyr::sample_frac(colon, 1/2) -set.seed(1234) -colon2 <- dplyr::sample_frac(colon, 1/2) - -# Create a named list of formulas -formula.list <- list( - sex = Surv(time, status) ~ sex, - adhere = Surv(time, status) ~ adhere, - rx = Surv(time, status) ~ rx -) - -# Fit survival curves -fit <- surv_fit(formula.list, data = list(colon1, colon2), - match.fd = FALSE)#> Warning: `combine()` is deprecated as of 
dplyr 1.0.0. -#> Please use `vctrs::vec_c()` instead. -#> This warning is displayed once every 8 hours. -#> Call `lifecycle::last_warnings()` to see where this warning was generated.surv_pvalue(fit)#> $`colon1::sex` -#> variable pval method pval.txt -#> 1 sex 0.8372769 Log-rank p = 0.84 -#> -#> $`colon2::sex` -#> variable pval method pval.txt -#> 1 sex 0.3901548 Log-rank p = 0.39 -#> -#> $`colon1::adhere` -#> variable pval method pval.txt -#> 1 adhere 0.0125047 Log-rank p = 0.013 -#> -#> $`colon2::adhere` -#> variable pval method pval.txt -#> 1 adhere 0.02104745 Log-rank p = 0.021 -#> -#> $`colon1::rx` -#> variable pval method pval.txt -#> 1 rx 0.001173476 Log-rank p = 0.0012 -#> -#> $`colon2::rx` -#> variable pval method pval.txt -#> 1 rx 4.449283e-05 Log-rank p < 0.0001 -#>- -# Grouped survfit -#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -# - Group by the treatment "rx" and fit survival curves on each subset -# - Returns a list of survfit objects -fit <- surv_fit(Surv(time, status) ~ sex, - data = colon, group.by = "rx") - -# Alternatively, do this -fit <- colon %>% - surv_group_by("rx") %>% - surv_fit(Surv(time, status) ~ sex, data = .) - -surv_pvalue(fit)#> $`rx.Obs::sex` -#> variable pval method pval.txt -#> 1 sex 0.5337304 Log-rank p = 0.53 -#> -#> $`rx.Lev::sex` -#> variable pval method pval.txt -#> 1 sex 0.2928911 Log-rank p = 0.29 -#> -#> $`rx.Lev+5FU::sex` -#> variable pval method pval.txt -#> 1 sex 0.0005623961 Log-rank p = 0.00056 -#>-
Split a data frame into multiple new data frames based on one or
- two grouping variables. The surv_group_by() function takes an
- existing data frame and converts it into a grouped data frame where
- survival analyses are performed "by group".
surv_group_by(data, grouping.vars)- -
data | -a data frame |
-
---|---|
grouping.vars | -a character vector containing the names of the grouping -variables. Should be of length <= 2 |
-
Returns an object of class surv_group_by, which is a
- tibble data frame with the following components:
one column for each grouping variable, containing its levels.
a - column named "data", which is a named list of data subsets created by the - grouping variables. The list names are created by concatenating the levels of - grouping variables.
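For intuition, the "data" component resembles what base R's split() produces. The sketch below splits the colon data by treatment and prefixes the list names with the grouping variable, imitating the rx.Obs style of names shown in the examples; it is an analogy under that naming assumption, not the function's actual implementation.

```r
library(survival)

# One data subset per level of the grouping variable "rx"
groups <- split(colon, colon$rx)

# Imitate the "variable.level" naming seen in the examples (assumed convention)
names(groups) <- paste("rx", names(groups), sep = ".")

# Number of rows in each subset
sapply(groups, nrow)
```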
-library("survival") -library("magrittr") - -# Grouping by one variables: treatment "rx" -#:::::::::::::::::::::::::::::::::::::::::: -grouped.d <- colon %>% - surv_group_by("rx") - -grouped.d # print#> # A tibble: 3 x 2 -#> # Groups: rx [3] -#> rx data -#> * <fct> <named list> -#> 1 Obs <tibble [630 × 15]> -#> 2 Lev <tibble [620 × 15]> -#> 3 Lev+5FU <tibble [608 × 15]>-grouped.d$data # Access to the data#> $rx.Obs -#> # A tibble: 630 x 15 -#> id study sex age obstruct perfor adhere nodes status differ extent -#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> -#> 1 3 1 0 71 0 0 1 7 1 2 2 -#> 2 3 1 0 71 0 0 1 7 1 2 2 -#> 3 5 1 1 69 0 0 0 22 1 2 3 -#> 4 5 1 1 69 0 0 0 22 1 2 3 -#> 5 8 1 1 54 0 0 0 1 0 2 3 -#> 6 8 1 1 54 0 0 0 1 0 2 3 -#> 7 13 1 1 64 0 0 0 1 1 2 3 -#> 8 13 1 1 64 0 0 0 1 1 2 3 -#> 9 15 1 1 46 1 0 0 4 0 2 3 -#> 10 15 1 1 46 1 0 0 4 0 2 3 -#> # … with 620 more rows, and 4 more variables: surg <dbl>, node4 <dbl>, -#> # time <dbl>, etype <dbl> -#> -#> $rx.Lev -#> # A tibble: 620 x 15 -#> id study sex age obstruct perfor adhere nodes status differ extent -#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> -#> 1 7 1 1 77 0 0 0 5 1 2 3 -#> 2 7 1 1 77 0 0 0 5 1 2 3 -#> 3 9 1 1 46 0 0 1 2 0 2 3 -#> 4 9 1 1 46 0 0 1 2 0 2 3 -#> 5 11 1 0 47 0 0 1 1 0 2 3 -#> 6 11 1 0 47 0 0 1 1 0 2 3 -#> 7 14 1 1 68 1 0 0 3 1 2 3 -#> 8 14 1 1 68 1 0 0 3 1 2 3 -#> 9 17 1 1 62 1 0 1 6 1 2 3 -#> 10 17 1 1 62 1 0 1 6 1 2 3 -#> # … with 610 more rows, and 4 more variables: surg <dbl>, node4 <dbl>, -#> # time <dbl>, etype <dbl> -#> -#> $`rx.Lev+5FU` -#> # A tibble: 608 x 15 -#> id study sex age obstruct perfor adhere nodes status differ extent -#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> -#> 1 1 1 1 43 0 0 0 5 1 2 3 -#> 2 1 1 1 43 0 0 0 5 1 2 3 -#> 3 2 1 1 63 0 0 0 1 0 2 3 -#> 4 2 1 1 63 0 0 0 1 0 2 3 -#> 5 4 1 0 66 1 0 0 6 1 2 3 -#> 6 4 1 0 66 1 0 0 6 1 2 3 -#> 7 6 1 0 57 0 0 0 9 1 2 3 -#> 8 6 1 0 57 0 0 0 9 1 2 3 -#> 9 10 
1 0 68 0 0 0 1 0 2 3 -#> 10 10 1 0 68 0 0 0 1 0 2 3 -#> # … with 598 more rows, and 4 more variables: surg <dbl>, node4 <dbl>, -#> # time <dbl>, etype <dbl> -#>-# Grouping by two variables -#:::::::::::::::::::::::::::::::::::::::::: -grouped.d <- colon %>% - surv_group_by(grouping.vars = c("rx", "adhere")) - grouped.d#> # A tibble: 6 x 3 -#> # Groups: rx, adhere [6] -#> rx adhere data -#> * <fct> <dbl> <named list> -#> 1 Obs 0 <tibble [536 × 14]> -#> 2 Obs 1 <tibble [94 × 14]> -#> 3 Lev 0 <tibble [522 × 14]> -#> 4 Lev 1 <tibble [98 × 14]> -#> 5 Lev+5FU 0 <tibble [530 × 14]> -#> 6 Lev+5FU 1 <tibble [78 × 14]>-
Returns the median survival with upper and lower confidence - limits for the median at the 95% confidence level.
-surv_median(fit, combine = FALSE)- -
fit | -A survfit object. Can be also a list of survfit objects. |
-
---|---|
combine | -logical value. Used only when fit is a list of survfit objects. -If TRUE, combine the results for multiple fits. |
-
Returns, for each fit, a data frame with the following columns:
strata: strata/group names
median: median survival of - each group
lower: 95% lower confidence limit
upper: 95% upper - confidence limit
Returns a list of data frames when the input is a - list of survfit objects. If combine = TRUE, results are combined into one single data frame.
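The same quantities are available from the standard summary of a survfit object, which may help when checking results. The sketch below rebuilds a data frame with the columns described above from summary(fit)$table; the column names "median", "0.95LCL" and "0.95UCL" are those of the survival package, and the object name med is made up for this example.

```r
library(survival)

fit <- survfit(Surv(time, status) ~ adhere, data = colon)

# For a stratified fit, summary()$table is a matrix with one row per stratum
tab <- summary(fit)$table

med <- data.frame(
  strata = rownames(tab),
  median = unname(tab[, "median"]),
  lower  = unname(tab[, "0.95LCL"]),   # lower 95% confidence limit
  upper  = unname(tab[, "0.95UCL"])    # upper 95% confidence limit (may be NA)
)
med
```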
- ---library(survival) - -# Different survfits -#::::::::::::::::::::::::::::::::::::::::::::::::::::::: -fit.null <- surv_fit(Surv(time, status) ~ 1, data = colon) - -fit1 <- surv_fit(Surv(time, status) ~ sex, data = colon) - -fit2 <- surv_fit(Surv(time, status) ~ adhere, data = colon) - -fit.list <- list(sex = fit1, adhere = fit2) - -# Extract the median survival -#::::::::::::::::::::::::::::::::::::::::::::::::::::::: -surv_median(fit.null)#> strata median lower upper -#> 1 All 2351 2018 2910-surv_median(fit2)#> strata median lower upper -#> 1 adhere=0 2718 2213 NA -#> 2 adhere=1 1272 997 1885-surv_median(fit.list)#> $sex -#> strata median lower upper -#> 1 sex=0 2174 1752 NA -#> 2 sex=1 2527 1976 2910 -#> -#> $adhere -#> strata median lower upper -#> 1 adhere=0 2718 2213 NA -#> 2 adhere=1 1272 997 1885 -#>-surv_median(fit.list, combine = TRUE)#> id strata median lower upper -#> 1 sex sex=0 2174 1752 NA -#> 2 sex sex=1 2527 1976 2910 -#> 3 adhere adhere=0 2718 2213 NA -#> 4 adhere adhere=1 1272 997 1885-# Grouped survfit -#::::::::::::::::::::::::::::::::::::::::::::::::::::::: -fit.list2 <- surv_fit(Surv(time, status) ~ sex, data = colon, - group.by = "rx") -surv_median(fit.list2)#> $`rx.Obs::sex` -#> strata median lower upper -#> 1 sex=0 1981 1272 NA -#> 2 sex=1 1539 1195 2284 -#> -#> $`rx.Lev::sex` -#> strata median lower upper -#> 1 sex=0 1885 1275 NA -#> 2 sex=1 1548 1061 2593 -#> -#> $`rx.Lev+5FU::sex` -#> strata median lower upper -#> 1 sex=0 NA 2021 NA -#> 2 sex=1 NA NA NA -#>
Compute p-value from survfit objects or parse it when provided by
- the user. Survival curves are compared using the log-rank test (default).
- Other methods can be specified using the argument method.
surv_pvalue( - fit, - data = NULL, - method = "survdiff", - test.for.trend = FALSE, - combine = FALSE, - ... -)- -
fit | -A survfit object. Can be also a list of survfit objects. |
-
---|---|
data | -data frame used to fit survival curves. Can be also a list of -data. |
-
method | -method used to compare survival curves. Default is "survdiff" (or -"log-rank"). Allowed values are one of:
To specify method, one can
-use either the weights (e.g.: "1", "n", "sqrtN", ...), or the full name
-("log-rank", "gehan-breslow", "Peto-Peto", ...), or the acronym (LR, GB,
-...). Case-insensitive partial matching is allowed. |
-
test.for.trend | -logical value. Default is FALSE. If TRUE, returns the -test for trend p-values. Tests for trend are designed to detect ordered -differences in survival curves. That is, for at least one group. The test -for trend can be only performed when the number of groups is > 2. |
-
combine | -logical value. Used only when fit is a list of survfit objects. -If TRUE, combine the results for multiple fits. |
-
... | -other arguments including pval, pval.coord, pval.method.coord. -These are only used internally to specify custom pvalue, pvalue and pvalue -method coordinates on the survival plot. Normally, users don't need these -arguments. |
-
Return a data frame with the columns (pval, method, pval.txt and - variable). If additional arguments (pval, pval.coord, pval.method.coord, - get_coord) are specified, then extra columns (pval.x, pval.y, method.x and - method.y) are returned.
pval: pvalue
method: method - used to compute pvalues
pval.txt: formatted text ready to use for - annotating plots
pval.x, pval.y: x & y coordinates of the pvalue for - annotating the plot
method.x, method.y: x & y coordinates of pvalue - method
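To make the pval and pval.txt columns concrete, the sketch below computes a log-rank p value directly with survival::survdiff and formats it as short text. The two-significant-digit formatting is an assumption chosen to resemble the output shown in the examples; the function's own formatting rules (for instance for very small p values) are not reproduced.

```r
library(survival)

# Log-rank test (rho = 0) comparing the survival curves of the two sexes
sdiff <- survdiff(Surv(time, status) ~ sex, data = colon, rho = 0)

# Chi-square statistic with (number of groups - 1) degrees of freedom
df <- length(sdiff$n) - 1
pval <- pchisq(sdiff$chisq, df = df, lower.tail = FALSE)

# Short text suitable for annotating a plot (assumed formatting)
pval.txt <- paste("p =", signif(pval, 2))
pval.txt
```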
--library(survival) - -# Different survfits -#::::::::::::::::::::::::::::::::::::::::::::::::::::::: -fit.null <- surv_fit(Surv(time, status) ~ 1, data = colon) - -fit1 <- surv_fit(Surv(time, status) ~ sex, data = colon) - -fit2 <- surv_fit(Surv(time, status) ~ adhere, data = colon) - -fit.list <- list(sex = fit1, adhere = fit2) - -# Extract the median survival -#::::::::::::::::::::::::::::::::::::::::::::::::::::::: -surv_pvalue(fit.null)#> Warning: There are no survival curves to be compared. -#> This is a null model.#> variable pval method pval.txt -#> 1 NA-surv_pvalue(fit2, colon)#> variable pval method pval.txt -#> 1 adhere 0.0002670768 Log-rank p = 0.00027-surv_pvalue(fit.list)#> $sex -#> variable pval method pval.txt -#> 1 sex 0.6107936 Log-rank p = 0.61 -#> -#> $adhere -#> variable pval method pval.txt -#> 1 adhere 0.0002670768 Log-rank p = 0.00027 -#>-surv_pvalue(fit.list, combine = TRUE)#> id variable pval method pval.txt -#> 1 sex sex 0.6107936361 Log-rank p = 0.61 -#> 2 adhere adhere 0.0002670768 Log-rank p = 0.00027-# Grouped survfit -#::::::::::::::::::::::::::::::::::::::::::::::::::::::: -fit.list2 <- surv_fit(Surv(time, status) ~ sex, data = colon, - group.by = "rx") - -surv_pvalue(fit.list2)#> $`rx.Obs::sex` -#> variable pval method pval.txt -#> 1 sex 0.5337304 Log-rank p = 0.53 -#> -#> $`rx.Lev::sex` -#> variable pval method pval.txt -#> 1 sex 0.2928911 Log-rank p = 0.29 -#> -#> $`rx.Lev+5FU::sex` -#> variable pval method pval.txt -#> 1 sex 0.0005623961 Log-rank p = 0.00056 -#>-# Get coordinate for annotion of the survival plots -#::::::::::::::::::::::::::::::::::::::::::::::::::::::: -surv_pvalue(fit.list2, combine = TRUE, get_coord = TRUE)#> id variable pval method pval.txt pval.x pval.y -#> 1 rx.Obs::sex sex 0.5337303974 Log-rank p = 0.53 64.28 0.2 -#> 2 rx.Lev::sex sex 0.2928911335 Log-rank p = 0.29 66.58 0.2 -#> 3 rx.Lev+5FU::sex sex 0.0005623961 Log-rank p = 0.00056 66.18 0.2 -#> method.x method.y -#> 1 64.28 0.3 -#> 2 66.58 0.3 -#> 3 
66.18 0.3-
Compared to the default summary() function, surv_summary()
- creates a data frame containing a nice summary from
- survfit results.
surv_summary(x, data = NULL)- -
x | -an object of class survfit. |
-
---|---|
data | -a dataset used to fit survival curves. If not supplied then data -will be extracted from 'fit' object. |
-
An object of class 'surv_summary', which is a data frame with - the following columns:
time: the time points at which the - curve has a step.
n.risk: the number of subjects at risk at t.
n.event: the number of events that occur at time t.
n.censor: number - of censored events.
surv: estimate of survival.
std.err: - standard error of survival.
upper: upper end of confidence interval.
lower: lower end of confidence interval.
strata: stratification of survival curves.
When survival curves have been fitted with one or more
- variables, the surv_summary object contains extra columns representing the
- variables. This makes it possible to facet the output of
- ggsurvplot by strata or by some combinations of factors.
The surv_summary object also has an attribute named 'table' containing - information about the survival curves, including medians of survival with - confidence intervals, as well as the total number of subjects and the - number of events in each curve.
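The columns listed above map closely onto components stored in a survfit object. As a minimal sketch, assuming only the survival package, the data frame below is assembled by hand from those components, with a 'table' attribute taken from summary(fit)$table. It omits the extra per-variable columns (rx, adhere) and is meant to show where the numbers come from, not to replace surv_summary().

```r
library(survival)

fit <- survfit(Surv(time, status) ~ rx + adhere, data = colon)

res <- data.frame(
  time     = fit$time,      # time points at which the curve has a step
  n.risk   = fit$n.risk,    # subjects at risk at each time
  n.event  = fit$n.event,   # events at each time
  n.censor = fit$n.censor,  # censored subjects at each time
  surv     = fit$surv,      # survival estimate
  std.err  = fit$std.err,   # standard error as stored in the survfit object
  upper    = fit$upper,     # upper confidence limit
  lower    = fit$lower,     # lower confidence limit
  # fit$strata holds the number of time points per stratum
  strata   = rep(names(fit$strata), times = fit$strata)
)

# Per-curve information (medians, confidence limits, number of events)
attr(res, "table") <- summary(fit)$table

head(res)
```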
- ---# Fit survival curves -require("survival") -fit <- survfit(Surv(time, status) ~ rx + adhere, data = colon) - -# Summarize -res.sum <- surv_summary(fit, data = colon) -head(res.sum)#> time n.risk n.event n.censor surv std.err upper lower -#> 1 20 536 1 0 0.9981343 0.001867414 1.0000000 0.9944878 -#> 2 43 535 1 0 0.9962687 0.002643394 1.0000000 0.9911204 -#> 3 45 534 1 0 0.9944030 0.003240519 1.0000000 0.9881072 -#> 4 59 533 1 0 0.9925373 0.003745345 0.9998501 0.9852780 -#> 5 72 532 1 0 0.9906716 0.004191364 0.9988435 0.9825667 -#> 6 77 531 1 0 0.9888060 0.004595738 0.9977529 0.9799393 -#> strata rx adhere -#> 1 rx=Obs, adhere=0 Obs 0 -#> 2 rx=Obs, adhere=0 Obs 0 -#> 3 rx=Obs, adhere=0 Obs 0 -#> 4 rx=Obs, adhere=0 Obs 0 -#> 5 rx=Obs, adhere=0 Obs 0 -#> 6 rx=Obs, adhere=0 Obs 0#> records n.max n.start events *rmean *se(rmean) median -#> rx=Obs, adhere=0 536 536 536 287 1884.796 57.33119 1896.0 -#> rx=Obs, adhere=1 94 94 94 58 1611.102 138.22808 1031.0 -#> rx=Lev, adhere=0 522 522 522 269 1890.034 60.48272 2012.0 -#> rx=Lev, adhere=1 98 98 98 64 1642.734 125.01246 1161.5 -#> rx=Lev+5FU, adhere=0 530 530 530 203 2285.457 55.43665 NA -#> rx=Lev+5FU, adhere=1 78 78 78 39 1946.302 152.86022 2174.0 -#> 0.95LCL 0.95UCL -#> rx=Obs, adhere=0 1447 2351 -#> rx=Obs, adhere=1 726 2077 -#> rx=Lev, adhere=0 1298 NA -#> rx=Lev, adhere=1 851 1895 -#> rx=Lev+5FU, adhere=0 NA NA -#> rx=Lev+5FU, adhere=1 993 NA- -