kernel_sievePH {sievePH}

R Documentation

Nonparametric Kernel-Smoothed Stratified Mark-Specific Proportional Hazards Model with a Univariate Continuous Mark, Fully Observed in All Failures.

Description

kernel_sievePH implements the estimation and hypothesis testing methods of Sun et al. (2009) for a mark-specific proportional hazards model. The methods allow a separate baseline mark-specific hazard function for each baseline subgroup (stratum).

Usage

kernel_sievePH(
  eventTime,
  eventInd,
  mark,
  tx,
  zcov = NULL,
  strata = NULL,
  formulaPH = ~tx,
  tau = NULL,
  tband = NULL,
  hband = NULL,
  nvgrid = 100,
  a = NULL,
  b = NULL,
  ntgrid = NULL,
  nboot = 500,
  seed = NULL,
  maxit = 6
)

Arguments

`eventTime`	a numeric vector specifying the observed right-censored event time.
`eventInd`	a numeric vector indicating the event of interest (1 if event, 0 if right-censored).
`mark`	a numeric vector specifying a univariate continuous mark. No missing values are permitted for subjects with `eventInd = 1`. For subjects with `eventInd = 0`, the value(s) in `mark` should be set to `NA`.
`tx`	a numeric vector indicating the treatment group (1 if treatment, 0 if placebo).
`zcov`	a data frame with one row per subject specifying possibly time-dependent covariate(s) (not including `tx`). If no covariate is used, `zcov` should be set to the default of `NULL`.
`strata`	a numeric vector specifying baseline strata (`NULL` by default). If specified, a separate mark-specific baseline hazard is assumed for each stratum.
`formulaPH`	a one-sided formula object (on the right side of the `~` operator) specifying the linear predictor in the proportional hazards model. Available variables to be used in the formula include `tx` and variable(s) in `zcov`. By default, `formulaPH` is specified as `~ tx`.
`tau`	a numeric value specifying the duration of the study follow-up period. Failures beyond `tau` are treated as right-censored. As a rule of thumb, at least 10% of subjects need to remain uncensored by `tau` for the estimation to be stable. By default, `tau` is set as the maximum of `eventTime`.
`tband`	a numeric value between 0 and `tau` specifying the bandwidth of the kernel smoothing function over time. By default, `tband` is set as (`tau`-min(`eventTime`))/5.
`hband`	a numeric value between 0 and 1 specifying the bandwidth of the kernel smoothing function over mark. By default, `hband` is set as `4\sigma n^{-1/3}` where `\sigma` is the estimated standard deviation of the observed marks for uncensored failure times and `n` is the number of subjects in the dataset. Larger bandwidths are recommended for higher percentages of missing marks.
`nvgrid`	an integer value (100 by default) specifying the number of equally spaced mark values between the minimum and maximum of the observed mark for which the treatment effects are evaluated.
`a`	a numeric value between the minimum and maximum of the observed mark values specifying the lower bound of the range for testing the null hypotheses `H_{10}: HR(v) = 1` and `H_{20}: HR(v)` does not depend on `v`, for `v \in [a, b]`. By default, `a` is set as `(max(mark) - min(mark))/nvgrid + min(mark)`.
`b`	a numeric value between the minimum and maximum of the observed mark values specifying the upper bound of the range for testing the null hypotheses `H_{10}: HR(v) = 1` and `H_{20}: HR(v)` does not depend on `v`, for `v \in [a, b]`. By default, `b` is set as `max(mark)`.
`ntgrid`	an integer value (`NULL` by default) specifying the number of equally spaced time points for which the mark-specific baseline hazard functions are evaluated. If `NULL`, baseline hazard functions are not evaluated.
`nboot`	number of bootstrap iterations (500 by default) for simulating the distributions of the test statistics. If `NULL`, the hypothesis tests are not performed.
`seed`	an integer specifying the random number generation seed for reproducing the test statistics and p-values. By default, a specific seed is not set.
`maxit`	Maximum number of iterations to attempt for convergence in estimation. The default is 6.
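
The documented defaults for `tau`, `tband`, `hband`, `a`, and `b` can be reproduced by hand, which may help when choosing alternative values. The following is a minimal sketch based only on the formulas stated above. The simulated data and every name ending in `_default` are illustrative assumptions, and the use of `sd()` is a guess at how the standard deviation is estimated internally.

```r
# Illustrative data (hypothetical): marks are NA for censored subjects
set.seed(1)
n <- 200
eventTime <- pmin(rexp(n, 0.2), 3)
eventInd <- as.numeric(eventTime < 3)
mark <- ifelse(eventInd == 1, runif(n), NA)
nvgrid <- 100

# tau: maximum of the observed event times
tau_default <- max(eventTime)

# tband: (tau - min(eventTime)) / 5
tband_default <- (tau_default - min(eventTime)) / 5

# hband: 4 * sigma * n^(-1/3), with sigma estimated from the marks of
# observed failures (assumption: sd(); the package may differ)
mark_obs <- mark[eventInd == 1]
sigma_hat <- sd(mark_obs)
hband_default <- 4 * sigma_hat * n^(-1 / 3)

# a and b: bounds of the testing range over the observed marks
a_default <- (max(mark_obs) - min(mark_obs)) / nvgrid + min(mark_obs)
b_default <- max(mark_obs)
```

Computing these values before calling `kernel_sievePH` makes it easier to check that `tband` lies between 0 and `tau` and that `hband` lies between 0 and 1, as the argument descriptions require.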

Details

kernel_sievePH analyzes data from a randomized placebo-controlled trial that evaluates treatment efficacy for a time-to-event endpoint with a continuous mark. The parameter of interest is the ratio of the conditional mark-specific hazard functions (treatment/placebo), which is based on a stratified mark-specific proportional hazards model. This model assumes no parametric form for either the baseline hazard function or the treatment effect across different mark values.
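
For reference, the stratified mark-specific proportional hazards model described above can be written as follows. The notation is an assumption chosen to match the symbols used elsewhere on this page and the general form in Sun et al. (2009): `k` indexes strata, `z` is the covariate vector built from `formulaPH`, and `\beta_1(v)` denotes the component of `\beta(v)` that multiplies `tx`.

```latex
\lambda_k(t, v \mid z) = \lambda_{0k}(t, v)\, \exp\{\beta(v)^{\top} z\},
  \qquad k = 1, \dots, K,
\\
HR(v) = \exp\{\beta_1(v)\}, \qquad TE(v) = 1 - HR(v).
```

Under this form, both the baseline hazard `\lambda_{0k}(t, v)` and the coefficient function `\beta(v)` are left unspecified, which is why kernel smoothing over time and over the mark is needed.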

Value

An object of class kernel_sievePH which can be processed by summary.kernel_sievePH to obtain or print a summary of the results. An object of class kernel_sievePH is a list containing the following components:

H10: a data frame with test statistics (first row) and corresponding p-values (second row) for testing H_{10}: HR(v) = 1 for v \in [a, b]. Columns TSUP1 and Tint1 include test statistics and p-values for testing H_{10} vs. H_{1a}: HR(v) \neq 1 for some v \in [a, b] (general alternative). Columns TSUP1m and Tint1m include test statistics and p-values for testing H_{10} vs. H_{1m}: HR(v) \leq 1 with strict inequality for some v \in [a, b] (monotone alternative). TSUP1 and TSUP1m are based on extensions of the classic Kolmogorov-Smirnov supremum-based test. Tint1 and Tint1m are based on generalizations of the integration-based Cramer-von Mises test. Tint1 and Tint1m involve integration of deviations over the whole range of the mark. If nboot is NULL, H10 is returned as NULL.
H20: a data frame with test statistics (first row) and corresponding p-values (second row) for testing H_{20}: HR(v) does not depend on v \in [a, b]. Columns TSUP2 and Tint2 include test statistics and p-values for testing H_{20} vs. H_{2a}: HR(v) depends on v \in [a, b] (general alternative). Columns TSUP2m and Tint2m include test statistics and p-values for testing H_{20} vs. H_{2m}: HR(v) increases as v increases in [a, b] (monotone alternative). TSUP2 and TSUP2m are based on extensions of the classic Kolmogorov-Smirnov supremum-based test. Tint2 and Tint2m are based on generalizations of the integration-based Cramer-von Mises test. Tint2 and Tint2m involve integration of deviations over the whole range of the mark. If nboot is NULL, H20 is returned as NULL.
estBeta: a data frame summarizing point estimates and standard errors of the mark-specific coefficients for treatment at equally-spaced values between the minimum and the maximum of the observed mark values.
cBproc1: a data frame containing equally-spaced mark values in the column Mark, test processes Q^{(1)}(v) for observed data in the column Observed, and Q^{(1)}(v) for nboot independent sets of normal samples in the columns S1, S2, \cdots. If nboot is NULL, cBproc1 is returned as NULL.
cBproc2: a data frame containing equally-spaced mark values in the column Mark, test processes Q^{(2)}(v) for observed data in the column Observed, and Q^{(2)}(v) for nboot independent sets of normal samples in the columns S1, S2, \cdots. If nboot is NULL, cBproc2 is returned as NULL.
Lambda0: an array of dimension K x nvgrid x ntgrid for the kernel-smoothed baseline hazard function \lambda_{0k}, k = 1, \dots, K, where K is the number of strata. If ntgrid is NULL (the default), Lambda0 is returned as NULL.
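
The layout of cBproc1 and cBproc2 (a Mark column, an Observed column, and simulated columns S1, S2, ...) lends itself to a resampling-style p-value. The sketch below illustrates that general logic for a supremum-type statistic on a mock data frame. The data are random numbers, the bounds `a` and `b` are arbitrary, and the statistic shown is an assumption for illustration. It is not claimed to match the exact definitions of TSUP1 or the other statistics computed by the package.

```r
# Mock object mimicking the documented column layout of cBproc1
# (hypothetical values, not output from kernel_sievePH)
set.seed(2)
nvgrid <- 20
nboot <- 50
sims <- matrix(rnorm(nvgrid * nboot), nrow = nvgrid,
               dimnames = list(NULL, paste0("S", seq_len(nboot))))
cBproc1 <- data.frame(Mark = seq(0, 1, length.out = nvgrid),
                      Observed = rnorm(nvgrid, mean = 1),
                      sims)

# Restrict the processes to an illustrative testing range [a, b]
a <- 0.05
b <- 1
in_range <- cBproc1$Mark >= a & cBproc1$Mark <= b

# Supremum of the absolute process, for the observed data and for
# each simulated column
sim_cols <- grep("^S[0-9]+$", names(cBproc1), value = TRUE)
t_obs <- max(abs(cBproc1$Observed[in_range]))
t_sim <- sapply(cBproc1[in_range, sim_cols, drop = FALSE],
                function(q) max(abs(q)))

# p-value: share of simulated statistics at least as large as observed
p_value <- mean(t_sim >= t_obs)
```

Because the p-value is a proportion over `nboot` simulated processes, its resolution is limited to multiples of `1 / nboot`, which is one reason to keep `nboot` reasonably large in practice.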

References

Sun, Y., Gilbert, P. B., & McKeague, I. W. (2009). Proportional hazards models with continuous marks. Annals of Statistics, 37(1), 394.

Yang, G., Sun, Y., Qi, L., & Gilbert, P. B. (2017). Estimation of stratified mark-specific proportional hazards models under two-phase sampling with application to HIV vaccine efficacy trials. Statistics in Biosciences, 9, 259-283.

Examples

set.seed(20240410)
beta <- 2.1
gamma <- -1.3
n <- 200
tx <- rep(0:1, each = n / 2)
tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(gamma)))
cens <- runif(n, 0, 15)
eventTime <- pmin(tm, cens, 3)
eventInd <- as.numeric(tm <= pmin(cens, 3))
alpha <- function(b){ log((1 - exp(-2)) * (b - 2) / (2 * (exp(b - 2) - 1))) }
mark0 <- log(1 - (1 - exp(-2)) * runif(n / 2)) / (-2)
mark1 <- log(1 + (beta - 2) * (1 - exp(-2)) * runif(n / 2) / (2 * exp(alpha(beta)))) /
  (beta - 2)
mark <- ifelse(eventInd == 1, c(mark0, mark1), NA)
# the true TE(v) curve underlying the data-generating mechanism is:
# TE(v) = 1 - exp{alpha(beta) + beta * v + gamma}

# complete-case estimation discards rows with a missing mark
fit <- kernel_sievePH(eventTime, eventInd, mark, tx, tau = 3, tband = 0.5,
                      hband = 0.3, nvgrid = 20, nboot = 20)
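
The comment in the example gives the true TE(v) curve behind the simulated data. As a hedged illustration, that curve can be evaluated on a mark grid for later comparison with the fitted estimates. The grid `v_grid` and the name `te_true` are assumptions introduced here. The range [0, 1] follows from the way `mark0` and `mark1` are generated above.

```r
# Parameters and helper copied from the example above
beta <- 2.1
gamma <- -1.3
alpha <- function(b) {
  log((1 - exp(-2)) * (b - 2) / (2 * (exp(b - 2) - 1)))
}

# Hypothetical evaluation grid over the mark range [0, 1]
v_grid <- seq(0, 1, length.out = 20)

# True treatment efficacy: TE(v) = 1 - exp{alpha(beta) + beta * v + gamma}
te_true <- 1 - exp(alpha(beta) + beta * v_grid + gamma)
```

With these parameter values the true efficacy declines as the mark increases, so a fit that recovers the data-generating mechanism should show a hazard ratio that rises with `v`.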

[Package sievePH version 1.1 Index]