probcurehboot {npcure} | R Documentation |
Compute the Bootstrap Bandwidth for the Nonparametric Estimator of the Cure Probability
Description
This function computes the bootstrap bandwidth for the nonparametric estimator of the conditional probability of cure.
Usage
probcurehboot(x, t, d, dataset, x0, bootpars = controlpars())
Arguments
x |
If |
t |
If |
d |
If |
dataset |
An optional data frame in which the variables named in x, t and d are interpreted. |
x0 |
A numeric vector of covariate values where the local bootstrap bandwidth will be computed. |
bootpars |
A list of parameters controlling the process of
bandwidth selection. The default is the value returned by the
controlpars function called without arguments. |
Details
The function computes the bootstrap bandwidth selector for the
nonparametric estimator of the cure probability at the covariate
values given by x0. The bootstrap bandwidth is the minimizer of
a bootstrap version of the Mean Squared Error (MSE) of the cure rate
estimator, which is approximated by Monte Carlo by simulating a large
number, B, of bootstrap resamples. The bootstrap MSE is the
bootstrap expectation of the squared difference between the value of
the cure rate estimator computed with the bootstrap sample in a grid of
bandwidths and its value computed with the original sample and a pilot
bandwidth. The bootstrap resamples are generated by using the simple
weighted bootstrap resampling method, fixing the covariate. This
method is equivalent to the simple weighted bootstrap of Li and Datta
(2001). All the parameters involved in the bootstrap bandwidth
selection process (number of bootstrap resamples, grid of bandwidths,
and pilot bandwidth) are typically set through the controlpars
function, whose output is passed to the bootpars argument. See the
help of controlpars for details.
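As a sketch of the criterion described above, the bootstrap MSE and its Monte Carlo approximation can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the package: \hat{q}_h(x_0) denotes the cure probability estimator at x_0 with bandwidth h, \hat{q}^{*b}_h(x_0) its value on the b-th bootstrap resample, g the pilot bandwidth, and \mathcal{H} the grid of bandwidths.

```latex
\mathrm{MSE}^{*}_{x_0}(h)
  = E^{*}\!\left[\left(\hat{q}^{*}_{h}(x_0) - \hat{q}_{g}(x_0)\right)^{2}\right]
  \approx \frac{1}{B}\sum_{b=1}^{B}
      \left(\hat{q}^{*b}_{h}(x_0) - \hat{q}_{g}(x_0)\right)^{2},
\qquad
h^{*}_{x_0} = \arg\min_{h \in \mathcal{H}} \mathrm{MSE}^{*}_{x_0}(h).
```

Under this reading, a separate minimization is carried out for each value in x0, which is why the selected bandwidths are local.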
Given the local nature of bootstrap bandwidth selection, estimates
obtained from sets of bootstrap bandwidths may sometimes look
wiggly. To counter this behavior, the selected vector of bootstrap
bandwidths can be smoothed by computing a moving average (its order
being set by controlpars). The smoothed bandwidths are then
contained in the hsmooth component of the returned value.
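To illustrate the idea of smoothing a vector of local bandwidths, here is a minimal base R sketch of a centered moving average. It is not the package's implementation: the helper name smooth_bandwidths, the restriction to odd orders, and the truncated window near the ends of the vector are all assumptions made for this example.

```r
## Hypothetical helper (not part of 'npcure'): centered moving average
## of odd order k. Near the ends of the vector the window is truncated
## to the values that are available.
smooth_bandwidths <- function(h, k = 7) {
  stopifnot(is.numeric(h), k >= 1, k %% 2 == 1)
  half <- (k - 1) %/% 2
  n <- length(h)
  vapply(seq_len(n), function(i) {
    window <- h[max(1, i - half):min(n, i + half)]
    mean(window)
  }, numeric(1))
}

## Made-up bandwidths that alternate between small and large values
h_raw <- c(0.8, 1.4, 0.7, 1.5, 0.9, 1.6, 0.8)
h_smooth <- smooth_bandwidths(h_raw, k = 3)
```

The smoothed vector has the same length as the input, so it can be used wherever the unsmoothed bandwidths would be, one bandwidth per covariate value.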
Value
An object of S3 class 'npcure'. Formally, a list of components:
type |
The constant character string c("Bootstrap bandwidth", "cure"). |
x0 |
Grid of covariate values. |
h |
Selected local bootstrap bandwidths. |
hsmooth |
Smoothed selected local bootstrap bandwidths (optional). |
hgrid |
Grid of bandwidths used (optional). |
Author(s)
Ignacio López-de-Ullibarri [aut, cre], Ana López-Cheda [aut], Maria Amalia Jácome [aut]
References
Li, G., Datta, S. (2001). A bootstrap approach to nonparametric regression for right censored data. Annals of the Institute of Statistical Mathematics, 53: 708–729. https://doi.org/10.1023/A:1014644700806.
López-Cheda, A., Cao, R., Jácome, M. A., Van Keilegom, I. (2017). Nonparametric incidence estimation and bootstrap bandwidth selection in mixture cure models. Computational Statistics & Data Analysis, 105: 144–165. https://doi.org/10.1016/j.csda.2016.08.002.
See Also
controlpars, probcure
Examples
## Some artificial data
set.seed(123)
n <- 50
x <- runif(n, -2, 2) ## Covariate values
y <- rweibull(n, shape = .5*(x + 4)) ## True lifetimes
c <- rexp(n) ## Censoring values
p <- exp(2*x)/(1 + exp(2*x)) ## Probability of being susceptible
u <- runif(n)
t <- ifelse(u < p, pmin(y, c), c) ## Observed times
d <- ifelse(u < p, ifelse(y < c, 1, 0), 0) ## Uncensoring indicator
data <- data.frame(x = x, t = t, d = d)
## A vector of covariate values
vecx0 <- seq(-1.5, 1.5, by = .1)
## Computation of bootstrap local bandwidth at the values of 'vecx0'...
#### ... with the default control parameters
set.seed(1) ## Not needed, just for reproducibility.
hb1 <- probcurehboot(x, t, d, data, x0 = vecx0)
#### ... changing the default 'bootpars' through 'controlpars()', with
#### arguments:
#### (a) 'B = 1999' (1999 bootstrap resamples are generated),
#### (b) 'hbound = c(.2, 4)' and 'hl = 50' (a grid of 50 bandwidths
#### between 0.2 and 4 times the standardized interquartile range of
#### the covariate values is built),
#### (c) 'hsave = TRUE' (the grid bandwidths are saved), and
#### (d) 'hsmooth = 7' (the bootstrap bandwidths are smoothed by a
#### moving average of order 7)
set.seed(1) ## Not needed, just for reproducibility.
hb2 <- probcurehboot(x, t, d, data, x0 = vecx0,
                     bootpars = controlpars(B = 1999, hbound = c(.2, 4),
                                            hl = 50, hsave = TRUE,
                                            hsmooth = 7))
## Estimates of the conditional probability of cure at the covariate
## values of 'vecx0' with the selected bootstrap bandwidths
q1 <- probcure(x, t, d, data, x0 = vecx0, h = hb1$h)
q2 <- probcure(x, t, d, data, x0 = vecx0, h = hb2$h)
q2sm <- probcure(x, t, d, data, x0 = vecx0, h = hb2$hsmooth)
## A plot comparing the estimates obtained with the bootstrap bandwidths
plot(q1$x0, q1$q, type = "l", xlab = "Covariate",
     ylab = "Cure probability", ylim = c(0, 1))
lines(q2$x0, q2$q, type = "l", lty = 2)
lines(q2sm$x0, q2sm$q, type = "l", lty = 3)
lines(q1$x0, 1 - exp(2*q1$x0)/(1 + exp(2*q1$x0)), col = 2)
legend("topright",
       c("Estimate with 'hb1'", "Estimate with 'hb2'",
         "Estimate with 'hb2' smoothed", "True"),
       lty = c(1, 2, 3, 1), col = c(1, 1, 1, 2))
## Example with the dataset 'bmt' of the 'KMsurv' package
## to study the probability of cure as a function of the age (z1).
data("bmt", package = "KMsurv")
x0 <- seq(quantile(bmt$z1, .05), quantile(bmt$z1, .95),
          length.out = 100)
## This might take a while
hb <- probcurehboot(z1, t2, d3, bmt, x0 = x0,
                    bootpars = controlpars(B = 1999, hbound = c(.2, 2),
                                           hl = 50, hsave = TRUE,
                                           hsmooth = 10))
q.age <- probcure(z1, t2, d3, bmt, x0 = x0, h = hb$h)
q.age.smooth <- probcure(z1, t2, d3, bmt, x0 = x0, h = hb$hsmooth)
## Plot of estimated cure probability
plot(q.age$x0, q.age$q, type = "l", ylim = c(0, 1),
     xlab = "Patient age (years)", ylab = "Cure probability")
lines(q.age.smooth$x0, q.age.smooth$q, col = 2)
legend("topright", c("Estimate with h bootstrap",
"Estimate with smoothed h bootstrap"), lty = 1, col = 1:2)