bcajack {bcaboot}    R Documentation

Nonparametric bias-corrected and accelerated bootstrap confidence limits

Description

This routine computes nonparametric confidence intervals for bootstrap estimates. For reproducibility, save or set the random number state before calling this routine.
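A minimal sketch of the two ways to make a bootstrap run repeatable, using only base R. The helper boot_means is a hypothetical stand-in for any routine that draws bootstrap samples; it is not part of bcaboot.

```r
# Hypothetical stand-in for any routine that draws bootstrap samples.
boot_means <- function(x, B) {
  replicate(B, mean(sample(x, replace = TRUE)))
}

x <- c(2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8)

# Option 1: set the seed immediately before each call.
set.seed(1234)
run1 <- boot_means(x, B = 200)
set.seed(1234)
run2 <- boot_means(x, B = 200)

# Option 2: save the current RNG state, then restore it before re-running.
state <- get(".Random.seed", envir = globalenv())
run3 <- boot_means(x, B = 200)
assign(".Random.seed", state, envir = globalenv())
run4 <- boot_means(x, B = 200)

identical(run1, run2)  # TRUE
identical(run3, run4)  # TRUE
```

Option 2 is convenient when the state at the time of the call matters more than a particular seed value.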

Usage

bcajack(
  x,
  B,
  func,
  ...,
  m = nrow(x),
  mr = 5,
  K = 2,
  J = 10,
  alpha = c(0.025, 0.05, 0.1, 0.16),
  verbose = TRUE
)

Arguments

x

an n \times p data matrix whose rows are observed p-vectors, assumed to be independently sampled from the target population. If p is 1 then x can be a vector.

B

number of bootstrap replications. It can also be a vector of B bootstrap replications of the estimated parameter of interest, computed separately.

func

function \hat{\theta} = func(x) computing an estimate of the parameter of interest; func(x) should return a real value for any n^\prime \times p matrix x^\prime, where n^\prime is not necessarily equal to n.
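As an illustration of the requirement on func, here is a sketch of a statistic that refers only to column positions and so accepts any number of rows. The name cor12 and the simulated data are assumptions made for this example.

```r
# Hypothetical statistic: correlation between the first two columns.
# It never refers to a fixed number of rows, so any row subset is acceptable.
cor12 <- function(x) {
  x <- as.matrix(x)
  cor(x[, 1], x[, 2])
}

set.seed(1)
x <- cbind(rnorm(30), rnorm(30))

full  <- cor12(x)                          # all n = 30 rows
fewer <- cor12(x[-(1:5), , drop = FALSE])  # a 25 x 2 subset, as a jackknife step would pass
```

Using drop = FALSE when subsetting keeps the input a matrix even if only a few rows remain.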

...

additional arguments for func.

m

an integer less than or equal to n; the routine collects the n rows of x into m groups to speed up the jackknife calculations for estimating the acceleration value a. Typically m is 20 or 40. It does not have to divide n exactly, but a warning is shown when it does not.
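The idea of grouping rows before jackknifing can be sketched as follows. This is an illustration under stated assumptions, not the package's code: accel_grouped is a hypothetical helper, it uses the standard jackknife formula for the acceleration, and it spreads rows over nearly equal groups rather than reproducing whatever bcajack does when m does not divide n.

```r
# Sketch of a grouped jackknife estimate of the acceleration a.
# `accel_grouped` is a hypothetical helper, not a bcaboot function.
accel_grouped <- function(x, func, m, ...) {
  x <- as.matrix(x)
  n <- nrow(x)
  stopifnot(m >= 2, m <= n)
  # Randomly assign the n rows to m groups of (nearly) equal size.
  group <- sample(rep_len(seq_len(m), n))
  # Recompute the statistic with each group left out in turn.
  u <- vapply(seq_len(m), function(j) {
    func(x[group != j, , drop = FALSE], ...)
  }, numeric(1))
  d <- mean(u) - u
  sum(d^3) / (6 * sum(d^2)^1.5)
}

set.seed(1234)
x <- matrix(rexp(200), ncol = 1)  # simulated, right-skewed data
# Only m = 20 evaluations of the statistic instead of n = 200.
a_hat <- accel_grouped(x, func = mean, m = 20)
# Averaging over several random groupings, in the spirit of the mr argument.
a_avg <- mean(replicate(5, accel_grouped(x, func = mean, m = 20)))
```

With m equal to n each group holds a single row and the calculation reduces to the ordinary delete-one jackknife.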

mr

if m < n then mr repetitions of the randomly grouped jackknife calculations are averaged.

K

a non-negative integer. If K > 0, bcajack also returns estimates of internal standard error, that is, of the variability due to stopping at B bootstrap replications rather than going on to infinity. These are obtained from a second type of jackknifing, taking an average of K separate jackknife estimates, each randomly splitting the B bootstrap replications into J groups.
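A rough sketch of this second jackknife: the B replications are split at random into J groups, the summary of interest is recomputed with each group removed, and the resulting jackknife standard errors are averaged over K random splits. The helper internal_se and the simulated replications are assumptions for illustration; the package's own calculation may differ in detail.

```r
# Sketch: internal (Monte Carlo) standard error of a bootstrap summary,
# via a grouped jackknife over the replications themselves.
# `internal_se` is a hypothetical helper, not part of bcaboot.
internal_se <- function(tt, stat, J = 10, K = 2) {
  B <- length(tt)
  one_split <- function() {
    group <- sample(rep_len(seq_len(J), B))   # random split into J groups
    s <- vapply(seq_len(J), function(j) stat(tt[group != j]), numeric(1))
    sqrt((J - 1) / J * sum((s - mean(s))^2))  # delete-one-group jackknife SE
  }
  mean(replicate(K, one_split()))             # average over K random splits
}

set.seed(1234)
tt <- rnorm(1000, mean = 0.5, sd = 0.03)      # stand-in for B bootstrap replications
se_sd <- internal_se(tt, stat = sd)           # variability of the bootstrap SE
se_q  <- internal_se(tt, stat = function(t) quantile(t, 0.975, names = FALSE))
```

A small internal standard error relative to the quantity itself suggests that B is large enough for that quantity.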

J

the number of groups into which the bootstrap replications are split

alpha

percentiles desired for the bca confidence limits. Only alpha values below 0.5 need to be provided; the corresponding upper limits are computed automatically.
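For instance, the default alpha can be mirrored into a full set of levels as below. Including the median (0.5) between the lower and upper levels is an assumption about the layout, made here only for illustration.

```r
alpha <- c(0.025, 0.05, 0.1, 0.16)
# Assumed layout: lower levels, the median, then the mirrored upper levels.
all_levels <- c(alpha, 0.5, rev(1 - alpha))
all_levels
```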

verbose

logical for verbose progress messages

Details

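Bootstrap confidence intervals depend on three elements: the cdf of the B bootstrap replications, the bias-correction number z_0, and the acceleration number a.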

The first two of these depend only on the bootstrap distribution, and not how it is generated: parametrically or non-parametrically. Program bcajack can be used in a hybrid fashion in which the vector tt of B bootstrap replications is first generated from a parametric model.
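To make concrete how a vector of replications enters the calculation, the sketch below computes BCa endpoints with the standard formula from the references. The helper bca_limits is hypothetical, and the acceleration a is passed in because it cannot be obtained from the replications alone.

```r
# Sketch of the standard BCa endpoint calculation.
# `bca_limits` is a hypothetical helper, not a bcaboot function.
bca_limits <- function(tt, t0, a, alpha = c(0.025, 0.975)) {
  # Bias correction from the share of replications below the estimate.
  # (Infinite if no replication, or every replication, lies below t0.)
  z0 <- qnorm(mean(tt < t0))
  z_alpha <- qnorm(alpha)
  # Adjusted percentile levels of the bootstrap distribution.
  adj <- pnorm(z0 + (z0 + z_alpha) / (1 - a * (z0 + z_alpha)))
  quantile(tt, probs = adj, names = FALSE)
}

set.seed(1234)
tt <- rnorm(2000, mean = 0.5, sd = 0.03)  # stand-in bootstrap replications
lims <- bca_limits(tt, t0 = 0.5, a = 0.02)
```

When z0 and a are both zero the adjusted levels equal alpha and the result is the ordinary percentile interval.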

So, in the diabetes example below, we might first draw bootstrap samples y^* \sim N(X\hat{\beta}, \hat{\sigma}^2 I) where \hat{\beta} and \hat{\sigma} were obtained from lm(y~X); each y^* would then provide a bootstrap replication tstar = rfun(cbind(X, ystar)). Then we could get bca intervals from bcajack(Xy, tt, rfun ....) with tt, the vector of B tstar values. The only difference from a full parametric bca analysis would lie in the nonparametric estimation of a, often a negligible error.
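The hybrid approach might look like the sketch below. To stay self-contained it uses simulated data in place of the diabetes data; the sizes n, p and B, the coefficients, and the choice m = 20 are all assumptions for illustration. The call to bcajack is run only if the package is installed.

```r
set.seed(1234)
n <- 100; p <- 3
X <- matrix(rnorm(n * p), n, p)               # simulated stand-in for the predictors
y <- drop(X %*% c(1, 0.5, 0)) + rnorm(n)      # simulated response
Xy <- cbind(X, y)

# Statistic of interest: adjusted R-squared, response in the last column.
rfun <- function(Xy) {
  k <- ncol(Xy)
  y <- Xy[, k]
  X <- Xy[, -k, drop = FALSE]
  summary(lm(y ~ X))$adj.r.squared
}

# Parametric model fitted to the observed data.
fit <- lm(y ~ X)
mu_hat <- fitted(fit)                         # X %*% beta_hat (with intercept)
sigma_hat <- summary(fit)$sigma

# B parametric bootstrap replications of the statistic.
B <- 200
tt <- replicate(B, {
  ystar <- mu_hat + rnorm(n, sd = sigma_hat)  # y* ~ N(X beta_hat, sigma_hat^2 I)
  rfun(cbind(X, ystar))
})

# Pass the precomputed replications through the B argument.
if (requireNamespace("bcaboot", quietly = TRUE)) {
  res <- bcaboot::bcajack(x = Xy, B = tt, func = rfun, m = 20, verbose = FALSE)
}
```

Here the replications come from the parametric model, while the acceleration is still estimated by jackknifing the rows of Xy.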

Value

a named list of several items

References

DiCiccio T and Efron B (1996). Bootstrap confidence intervals. Statistical Science 11, 189-228.

Efron B (1987). Better bootstrap confidence intervals. JASA 82, 171-200.

Efron B and Narasimhan B (2018). Automatic Construction of Bootstrap Confidence Intervals.

Examples

data(diabetes, package = "bcaboot")
Xy <- cbind(diabetes$x, diabetes$y)
rfun <- function(Xy) {
  y <- Xy[, 11]    # response: the column bound on from diabetes$y
  X <- Xy[, 1:10]  # predictors: the columns from diabetes$x
  summary(lm(y ~ X))$adj.r.squared
}
set.seed(1234)
## n = 442 = 34 * 13
bcajack(x = Xy, B = 1000, func = rfun, m = 34, verbose = FALSE)

[Package bcaboot version 0.2-3 Index]