R: Plot Confounder-Adjusted Survival Curves

plot.adjustedsurv {adjustedCurves}

R Documentation

Plot Confounder-Adjusted Survival Curves

Description

A function to graphically display confounder-adjusted survival curves which where previously estimated using the adjustedsurv function. The user can customize the plot using a variety of options. Internally it uses the ggplot2 package, so additional not implemented features can be added using the standard ggplot2 syntax. This function also includes the option to use isotonic regression on the survival curves, which is of benefit if the estimated curves are not monotone.

Usage

## S3 method for class 'adjustedsurv'
plot(x, conf_int=FALSE, max_t=Inf,
     iso_reg=FALSE, force_bounds=FALSE,
     use_boot=FALSE, cif=FALSE, color=TRUE,
     linetype=FALSE, facet=FALSE,
     line_size=1, line_alpha=1, xlab="Time",
     ylab="Adjusted Survival Probability",
     title=NULL, subtitle=NULL, legend.title="Group",
     legend.position="right",
     gg_theme=ggplot2::theme_classic(),
     ylim=NULL, custom_colors=NULL,
     custom_linetypes=NULL,
     single_color=NULL, single_linetype=NULL,
     conf_int_alpha=0.4, steps=TRUE,
     x_breaks=ggplot2::waiver(), x_n_breaks=NULL,
     y_breaks=ggplot2::waiver(), y_n_breaks=NULL,
     additional_layers=list(),
     median_surv_lines=FALSE, median_surv_size=0.5,
     median_surv_linetype="dashed",
     median_surv_color="black", median_surv_alpha=1,
     median_surv_quantile=0.5,
     censoring_ind="none", censoring_ind_size=0.5,
     censoring_ind_alpha=1, censoring_ind_shape=17,
     censoring_ind_width=NULL,
     risk_table=FALSE, risk_table_type="n_at_risk",
     risk_table_stratify=FALSE, risk_table_height=0.25,
     risk_table_xlab=xlab, risk_table_ylab="default",
     risk_table_title="default", risk_table_title_size=14,
     risk_table_title_position="left",
     risk_table_y_vjust=5, risk_table_theme=gg_theme,
     risk_table_size=4.2, risk_table_alpha=1,
     risk_table_color="black", risk_table_family="sans",
     risk_table_fontface="plain", risk_table_reverse=TRUE,
     risk_table_stratify_color=TRUE,
     risk_table_custom_colors=custom_colors,
     risk_table_use_weights=TRUE, risk_table_digits=1,
     risk_table_format=TRUE, risk_table_warn=TRUE,
     risk_table_additional_layers=list(),
     ...)

Arguments

`x`	An `adjustedsurv` object created using the `adjustedsurv` function.
`conf_int`	A logical variable indicating whether the confidence intervals should be drawn.
`max_t`	A number indicating the latest survival time which is to be plotted.
`iso_reg`	A logical variable indicating whether the estimates should be monotonized using isotonic regression. See details.
`force_bounds`	A logical variable indicating whether the 0 and 1 bounds of the survival probabilities should be forced in the plot. See details.
`use_boot`	A logical variable denoting whether the bootstrapped estimates should be used for the curves and their confidence intervals. Can only be used if they were calculated. See `adjustedsurv`.
`cif`	If `TRUE` the cumulative incidence functions are drawn instead of the survival curves. Those are calculated by taking 1 - the adjusted survival probability. If `FALSE` (default) the usual survival curves are shown.
`color`	A logical variable indicating whether the curves should be colored differently. The `custom_colors` argument can be used to directly specify which colors to use. Alternatively the `single_color` argument can be used if everything should have the same color.
`linetype`	A logical variable indicating whether the curves should have different linetypes. The `custom_linetypes` argument can be used to directly specify which linetypes to use. Alternatively the `single_linetype` argument can be used if all curves should have the same linetype.
`facet`	A logical variable indicating whether the curves should be in different facets.
`line_size`	A number controlling the thickness of the survival curves.
`line_alpha`	A number controlling the transparency level of the survival curves.
`xlab`	A character string to be used as the X-Axis label of the plot.
`ylab`	A character string to be used as the Y-Axis label of the plot.
`title`	A character string to be used as the title of the plot. Set to `NULL` if no title should be used.
`subtitle`	A character string to be used as the subtitle of the plot. Set to `NULL` if no subtitle should be used.
`legend.title`	A character string to be used as the title of the legend. Set to `NULL` if no legend should be included.
`legend.position`	A character string specifying the position of the legend. Ignored if `legend_title=NULL`.
`gg_theme`	A `ggplot2` theme object which will be used for the plot.
`ylim`	A numeric vector of length two, specifying the limits of the Y-Axis. Set to `NULL` to use the `ggplot2` default values.
`custom_colors`	A (named) vector to specify the colors of each adjusted survival curve and possibly its confidence region. Set to `NULL` to use the `ggplot2` default values. Ignored if `color=FALSE`.
`custom_linetypes`	A (named) vector to specify the linetype of each adjusted survival curve. Set to `NULL` to use the `ggplot2` default values. Ignored if `color=FALSE`. Ignored if `linetype=FALSE`.
`single_color`	A single color to use for every survival curve, irrespective of group status. If `color` is specified as well this argument will override it, but also generate a warning. Set to `NULL` (default) to ignore this argument.
`single_linetype`	A single linetype to use for every survival curve, irrespective of group status. If `linetype` is specified as well this argument will override it, but also generate a warning. Set to `NULL` (default) to ignore this argument.
`conf_int_alpha`	A number indicating the level of transparency that should be used when drawing the confidence regions.
`steps`	A logical variable indicating whether the survival curves should be plotted as a step function or using straight lines. Straight lines should not be used with a simple Kaplan-Meier estimator. It is recommended to only use straight lines when a sufficiently fine grid of time points was used in the estimation step.
`x_breaks`	An optional numeric vector specifying the breaks on the x-axis. These will also be used in the risk table if `risk_table=TRUE`. Internally, this is passed to the `breaks` argument of the `scale_x_continuous` function.
`x_n_breaks`	An optional positive number specifying the number of breaks on the x-axis. Internally, this is passed to the `n.breaks` argument of the `scale_x_continuous` function.
`y_breaks`	An optional numeric vector specifying the breaks on the y-axis. Internally, this is passed to the `breaks` argument of the `scale_y_continuous` function.
`y_n_breaks`	An optional positive number specifying the number of breaks on the y-axis. Internally, this is passed to the `n.breaks` argument of the `scale_y_continuous` function.
`additional_layers`	An optional list of objects that should be added to the survival curve plot. Can be useful to customize the plot further when also using `risk_table=TRUE`.
`median_surv_lines`	Whether to draw indicator lines for the median survival times, which makes it easier to read those off the curves. Survival curves with undefined median survival times receive no lines.
`median_surv_size`	The size of the median survival indicator lines. Ignored if `median_surv_lines=FALSE`.
`median_surv_linetype`	The linetype of the median survival indicator lines. Ignored if `median_surv_lines=FALSE`.
`median_surv_color`	The color of the median survival indicator lines. Ignored if `median_surv_lines=FALSE`.
`median_surv_alpha`	The transparency level of the median survival indicator lines. Ignored if `median_surv_lines=FALSE`.
`median_surv_quantile`	The survival quantile which should be drawn. To draw the median survival time, set this parameter to 0.5 (default).
`censoring_ind`	What kind of indicator to plot for censored observations on the survival curves. Must be one of `"none"` (plotting no indicators at all, the default), `"lines"` (plotting small vertical lines) and `"points"` (plotting points). Those will be affected by `linetype` and `color` as well.
`censoring_ind_size`	A numeric value specifying the size of the censoring indicators. Ignored if `censoring_ind="none"`.
`censoring_ind_alpha`	A numeric value specifying the alpha level of the censoring indicators. Ignored if `censoring_ind="none"`.
`censoring_ind_shape`	A numeric value specifying the shape of the censoring indicators when using `censoring_ind="points"`. Ignored otherwise. For available shapes see `?geom_point`.
`censoring_ind_width`	A numeric value specifying the width of the censoring indicators. Ignored unless `censoring_ind="lines"`. By default (`censoring_ind_width=NULL`) the width of the censoring indicators is equal to 5 percent of the plot height.
`risk_table`	Either `TRUE` or `FALSE`, indicating whether a risk table should be added below the plot or not. Please see details before using this option, since there is still no agreed upon way to include them for adjusted survival curves. Internally, a separate plot is created for the table and added to the survival curve plot using the `plot_grid` function of the cowplot package. The output is still a `ggplot` object, but it may not be possible to change some aspects of that object anymore. Note also that all arguments starting with `risk_table_` indicate arguments used to customize risk tables. These will be ignored if `risk_table=FALSE`.
`risk_table_type`	A single character string specifying what kind of risk table should be created. Has to be one of `"n_at_risk"` (number at risk, the default), `"n_cens"` (cumulative number of censored people up to time `t`) or `"n_events"` (cumulative number of events up to time `t`). Note that these numbers will be weighted if and only if `risk_table_use_weights` is set to `TRUE` and the method used to adjust the survival curves was one of `c("iptw_km", "iptw_cox", "iptw_pseudo")`. Otherwise, unadjusted numbers are returned. See details.
`risk_table_stratify`	Either `TRUE` or `FALSE`, indicating whether the risk table should be stratified by `variable`.
`risk_table_height`	A single number between 0 and 1, indicating the proportion of the resulting plot that should be used for the risk table. By making this number larger, the "height" of the risk table is increased. Defaults to 0.25, e.g. about a quarter of the plot is reserved for the risk table.
`risk_table_xlab`	A single character string specifying the x-axis label for the risk table. By default, the same name is used as was used for the survival curve plot.
`risk_table_ylab`	A single character string specifying the y-axis label of the risk table. By default this is set to `"default"`, which is replaced internally depending on the supplied `risk_table_type`, `risk_table_use_weights` and `risk_table_stratify` arguments. Set to `NULL` to use no y-axis label.
`risk_table_title`	A single character string specifying the title of the risk table. By default this is set to `"default"`, which is replaced internally depending on the supplied `risk_table_type`, `risk_table_use_weights` and `risk_table_stratify` arguments. Set to `NULL` to use no title.
`risk_table_title_size`	A single positive number specifying the size of the title of the risk table.
`risk_table_title_position`	One of `c("left", "middle", "right")`, specifying where the title of the risk table should be put. Defaults to `"left"`.
`risk_table_y_vjust`	A single positive number specifying the distance between the y-axis and the y-axis label. Is only used when `risk_table_stratify=FALSE`. Set to 5 by default to make the y-axis label align with the y-axis label of the survival curve plot.
`risk_table_theme`	A `ggplot2` theme object which will be used for the risk table. Defaults to the same theme used for the survival curve plot, but may also be different.
`risk_table_size`	A single positive number specifying the size of the numbers used in the risk table.
`risk_table_alpha`	A single number between 0 and 1 specifying the alpha level of the numbers in the risk table.
`risk_table_color`	A single character string specifying the color of the numbers in the risk table. Only used when `risk_table_stratify=FALSE`. To change the colors when using `risk_table_stratify=TRUE`, use the `risk_table_custom_colors` argument.
`risk_table_family`	A single character string specifying the text family that should be used for the numbers in the risk table.
`risk_table_fontface`	A single character string specifying the fontface of the text used to write the numbers in the risk table. Must be one of `c("plain", "bold", "italic", "bold.italic")`. Defaults to `"plain"`.
`risk_table_reverse`	Whether to reverse the order of groups in the risk table when using `risk_table_stratify=TRUE`.
`risk_table_stratify_color`	Either `TRUE` or `FALSE`, indicating whether the risk table should have different colors for different groups. Only used when `risk_table_stratify=TRUE`.
`risk_table_custom_colors`	A vector of custom colors used when both `risk_table_stratify` and `risk_table_stratify_color` are set to `TRUE`. By default this uses the same colors as the survival curve plot, but it doesn't have to.
`risk_table_use_weights`	Whether to use weights in the calculation of the values shown in the risk table or not. By default weights are used when they were used in the calculation of the adjusted survival curves as well. Otherwise, unadjusted risk tables are returned. Note that since the sum of all weights is not necessarily equal to the number of individuals in the dataset (unless `stabilize=TRUE` was used in the original `adjustedsurv` call), these numbers may seem odd. If weights are used it is also highly likely that the results will not be integers. See `risk_table_digits`.
`risk_table_digits`	A single number specifying how many digits should be used when rounding the numbers presented in the risk table. Since unadjusted calculations always result in integer values, this argument is only relevant when weights or multiple imputation were used. See `?round`.
`risk_table_format`	Either `TRUE` or `FALSE`, indicating whether the `format()` function should be called on the numbers included in the risk table. This defaults to `TRUE`, so that all numbers have an equal number of digits after the decimal point. This is only important when using either multiply imputed data or weighted risk tables, because otherwise the numbers will only be integers anyways.
`risk_table_warn`	Either `TRUE` or `FALSE`, indicating whether the user should be warned about some subtleties when using risk tables. We strongly urge the user to read the details before ignoring the warnings. Set to `FALSE` to ignore all warnings.
`risk_table_additional_layers`	An optional list of objects that should be added to the risk table plot. Can be used to further customize the risk table using standard `ggplot2` syntax. Explained in more detail in the vignette on plot customization.
`...`	Currently not used.

Details

On the use of Risk Tables

Due to popular demand, this function now also supports adding risk tables below the survival curves as of version 0.12.0. In previous versions of this package, this was not implemented due to the unclear interpretation of the risk tables depending on the method. When using the standard Kaplan-Meier method, the number of people at risk (risk_table_type="n_at_risk") is used directly to obtain the survival curve estimates. Showing the number at risk or some related numbers (such as number of events etc.) makes sense and is a nice way to communicate uncertainty of the results. However, most methods in this package DO NOT use number of people at risk to arrive at the survival curve estimates. The only methods that do so are "km" (the classic unadjusted Kaplan-Meier method) as well as "iptw_km" and "iptw_cox". The latter two actually use weighted numbers at risk.

For the two methods relying on weighted numbers at risk, it may be reasonable to add weighted risk tables. This is done by default (risk_table_use_weights=TRUE) if risk_table=TRUE. However, since these numbers are weighted they no longer correspond to actually observed numbers, which may lead to two confusing things happening: (1) the numbers are not integers (for example 120.45 people at risk at t) and / or (2) the sum of the numbers far exceeds the number of actual observations. (1) may be "solved" by rounding (setting risk_table_digits=0) and (2) may be solved by enforcing the sum of the weights to be equal to the number of observations by setting stabilize=TRUE in the original adjustedsurv() call.

For all other methods it is unclear if the use of risk tables makes sense. Since no weights were used in their estimation, there is no way to get "adjusted" risk tables in that case. The therefore unadjusted numbers may not correspond to the shown adjusted survival curves. For example, the adjusted survival curves may extend to regions where there are actually 0 number of people at risk left. We recommend using only unstratified risk tables in this case (keeping risk_table_stratify=FALSE).

Isotonic Regression & Forcing Bounds

When using certain methods there is no guarantee that the resulting estimated survival curves are non-increasing. This is unfortunate since we know that it has to be the case. Isotonic regression can be used to fix this problem by ensuring that the survival curves are actually non-increasing everywhere, while also being as close to the observations as possible. Westling et al. (2020) showed mathematically that this does not add any systematic bias to the estimates. More information on the method can be found in Robertson et al. (1988). This adjustment can be done using this function by setting iso_reg to TRUE.

Similarly, some methods can produce estimates that lie outside the theoretical 0 and 1 bounds of probability. By setting force_bounds to TRUE these estimates are manually set to either 0 or 1 (whichever is closer).

Multiple Imputation

When multiple imputation was used to create the adjusted survival curves, the adjusted survival curves pooled over all imputations is displayed as described in the documentation page of the adjustedsurv function. When adding censoring indicators (censoring_ind=TRUE), only the censoring times actually observed in the original data will be displayed. Possible imputed censoring times are ignored. When using risk_table=TRUE on multiply imputed data, the numbers making up the risk table are calculated for each imputed dataset and are pooled using Rubins Rule afterwards.

Plot Customization

Although this function offers a lot of arguments to customize the resulting plot, there is always need for more customization options. Since the function returns a standard ggplot2 object, users can usually use standard ggplot2 syntax to add more geoms or change certain aspects of the plot. This, however, is not guaranteed to work when using risk_table=TRUE. The additional_layers and risk_table_additional_layers arguments can be used to sidestep this issue. This and more details on plot customization are given in the associated package vignette (vignette(topic="plot_customization", package="adjustedCurves")).

Other Plotting Options

If you prefer using the ggsurvplot syntax, you can also use the as_ggsurvplot_df function to extract a data.frame from the adjustedsurv object, which can be used directly to call the ggsurvplot_df function from the survminer package.

Value

Returns a ggplot2 object.

Author(s)

Robin Denz

References

Robin Denz, Renate Klaaßen-Mielke, and Nina Timmesfeld (2023). "A Comparison of Different Methods to Adjust Survival Curves for Confounders". In: Statistics in Medicine 42.10, pp. 1461-1479

Ted Westling, Mark J. van der Laan, and Marco Carone (2020). "Correcting an Estimator of a Multivariate Monotone Function with Isotonic Regression". In: Electronic Journal of Statistics 14, pp. 3032-3069

Tim Robertson, F. T. Wright, and R. L. Dykstra (1988). Order Restricted Statistical Inference. Hoboken: John Wiley & Sons

Examples

library(adjustedCurves)
library(survival)

if (requireNamespace("riskRegression") & requireNamespace("ggplot2")) {

library(ggplot2)

set.seed(42)

# simulate some data as example
sim_dat <- sim_confounded_surv(n=50, max_t=1.2)
sim_dat$group <- as.factor(sim_dat$group)

# estimate a cox-regression for the outcome
cox_mod <- coxph(Surv(time, event) ~ x1 + x2 + x3 + x4 + x5 + x6 + group,
                 data=sim_dat, x=TRUE)


# use it to calculate adjusted survival curves with bootstrapping
adjsurv <- adjustedsurv(data=sim_dat,
                        variable="group",
                        ev_time="time",
                        event="event",
                        method="direct",
                        outcome_model=cox_mod,
                        conf_int=TRUE,
                        bootstrap=TRUE,
                        n_boot=15) # should be much bigger in reality

# plot the curves with default values
plot(adjsurv)

# plot after applying isotonic regression
plot(adjsurv, iso_reg=TRUE)

# plot with confidence intervals estimated using asymptotic variances
plot(adjsurv, conf_int=TRUE)

# plot with confidence intervals estimated using bootstrapping
plot(adjsurv, conf_int=TRUE, use_boot=TRUE)

# plot with different linetypes only
plot(adjsurv, linetype=TRUE, color=FALSE, facet=FALSE)

# plot with median survival indicator lines
plot(adjsurv, median_surv_lines=TRUE)

# plot with crude risk table (recommended)
plot(adjsurv, conf_int=TRUE, risk_table=TRUE)

# plot with stratified risk table
# (only recommended for methods that use weights)
plot(adjsurv, conf_int=TRUE, risk_table=TRUE, risk_table_stratify=TRUE)

# plot with small lines indicating where observations were censored
plot(adjsurv, censoring_ind="lines")

# plot with points indicating where observations were censored
plot(adjsurv, censoring_ind="points", censoring_ind_size=4)

# adding further ggplot2 elements
plot(adjsurv) + theme_bw()
}
# NOTE: more examples are shown in the associated vignette

[Package adjustedCurves version 0.11.2 Index]