abe.boot {abe}    R Documentation

Bootstrapped Augmented Backward Elimination

Description

Performs augmented backward elimination (ABE) on re-sampled datasets, using the bootstrap, the m-out-of-n bootstrap, or subsampling.

Usage

abe.boot(fit, data = NULL, include = NULL, active = NULL, tau = 0.05,
  exp.beta = TRUE, exact = FALSE, criterion = "alpha", alpha = 0.2,
  type.test = "Chisq", type.factor = NULL, num.boot = 100,
  type.boot = c("bootstrap", "mn.bootstrap", "subsampling"),
  prop.sampling = 0.632)

Arguments

fit

An object of class "lm", "glm" or "coxph" representing the fit. Note that the model should be fitted with the arguments x=TRUE and y=TRUE.

data

data frame used when fitting the object fit.

include

a vector containing the names of variables that will be included in the final model. These variables are used only as passive variables during modeling. They might be exposure variables of interest or known confounders. They are never dropped from the working model in the selection process, but they are used passively when evaluating the change-in-estimate criteria of other variables. Note that variables in the model fit that are specified neither as include nor as active are treated as both active and passive variables.

active

a vector containing the names of active variables. These less important explanatory variables will only be used as active, but not as passive variables when evaluating the change-in-estimate criterion.

tau

Value that specifies the threshold of the relative change-in-estimate criterion. Default is set to 0.05.

exp.beta

Logical specifying whether the exponent is used in the formula that standardizes the criterion. Default is set to TRUE.

exact

Logical specifying whether the method uses the exact change-in-estimate or an approximation. Default is set to FALSE, in which case the method uses the approximation proposed by Dunkler et al. Note that setting exact to TRUE can severely slow down the algorithm, while setting it to FALSE can in some cases lead to a poor approximation of the change-in-estimate criterion.

criterion

String that specifies the strategy to select variables for the blacklist. Currently supported options are significance level "alpha", Akaike information criterion "AIC" and Bayesian information criterion "BIC". When criterion = "alpha", the significance level must be given through the parameter alpha. Default is set to "alpha".

alpha

Value that specifies the level of significance as explained above. Default is set to 0.2.

type.test

String that specifies which test is performed when criterion = "alpha". Possible values are "F" and "Chisq" (default) for class "lm"; "Rao", "LRT", "Chisq" (default) and "F" for class "glm"; and "Chisq" for class "coxph". See also drop1.

type.factor

String that specifies how to treat factors. Possible values are "factor" and "individual". See details.

num.boot

Number of bootstrap re-samples. Default is set to 100.

type.boot

String that specifies the type of bootstrap. Possible values are "bootstrap", "mn.bootstrap" and "subsampling". See details.

prop.sampling

Sampling proportion. Only applicable for type.boot="mn.bootstrap" and type.boot="subsampling", defaults to 0.632. See details.
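The fit argument above asks for a model fitted with x=TRUE and y=TRUE. The following minimal sketch, using simulated data and object names that are purely illustrative, shows what this looks like for lm and glm and checks that the model matrix and response are stored in the fitted object. To the best of our knowledge coxph from the survival package accepts the same two arguments, but that case is not shown here.

```r
# Simulated data; all names here are illustrative assumptions
set.seed(1)
dd <- data.frame(x1 = runif(50), x2 = runif(50))
dd$y  <- 1 + 2 * dd$x1 + rnorm(50)          # continuous response for lm
dd$yb <- rbinom(50, size = 1, prob = 0.5)   # binary response for glm

# x = TRUE keeps the model matrix, y = TRUE keeps the response
fit.lm  <- lm(y ~ x1 + x2, data = dd, x = TRUE, y = TRUE)
fit.glm <- glm(yb ~ x1 + x2, family = binomial, data = dd,
               x = TRUE, y = TRUE)

# Both components are now available in the fitted objects
dim(fit.lm$x)       # model matrix: 50 rows, intercept plus two covariates
length(fit.glm$y)   # stored response vector
```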

Details

type.boot can be one of the following, where m is [prop.sampling*n]:

"bootstrap": n observations drawn from the original data with replacement.

"mn.bootstrap": m out of n observations drawn from the original data with replacement.

"subsampling": m out of n observations drawn from the original data without replacement.
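The three re-sampling schemes can be illustrated with a short base R sketch. The helper draw_indices is hypothetical and is not part of the abe package; it also assumes that m is obtained by taking the integer part of prop.sampling*n, which the text above does not state explicitly.

```r
# Hypothetical helper (not part of abe): draws the row indices of one
# re-sample under each of the three schemes described above.
draw_indices <- function(n,
                         type.boot = c("bootstrap", "mn.bootstrap", "subsampling"),
                         prop.sampling = 0.632) {
  type.boot <- match.arg(type.boot)
  # Assumption: m is the integer part of prop.sampling * n
  m <- floor(prop.sampling * n)
  switch(type.boot,
    bootstrap    = sample.int(n, size = n, replace = TRUE),
    mn.bootstrap = sample.int(n, size = m, replace = TRUE),
    subsampling  = sample.int(n, size = m, replace = FALSE)
  )
}

set.seed(1)
idx_boot <- draw_indices(100, "bootstrap")
idx_mn   <- draw_indices(100, "mn.bootstrap")
idx_sub  <- draw_indices(100, "subsampling", prop.sampling = 0.5)

length(idx_boot)        # 100
length(idx_mn)          # 63
anyDuplicated(idx_sub)  # 0, because subsampling draws without replacement
```

A re-sampled dataset would then be obtained by indexing the rows of the original data frame, for example dd[idx_sub, ].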

Value

An object of class abe, for which summary and plot functions are available. It is a list with the following elements:

models

the final models obtained after performing ABE on re-sampled datasets; each object in the list is of the same class as fit

alpha

the vector of significance levels used

tau

the vector of threshold values for the change-in-estimate

num.boot

number of re-sampled datasets

criterion

criterion used when constructing the blacklist

all.vars

a list of variables used when estimating fit

fit.or

the initial model

Author(s)

Rok Blagus, rok.blagus@mf.uni-lj.si

Sladana Babic

References

Daniela Dunkler, Max Plischke, Karen Leffondré, and Georg Heinze. Augmented backward elimination: a pragmatic and purposeful way to develop statistical models. PLoS ONE, 9(11):e113677, 2014.

Riccardo De Bin, Silke Janitza, Willi Sauerbrei, and Anne-Laure Boulesteix. Subsampling versus bootstrapping in resampling-based model selection for multivariable regression. Biometrics, 72:272-280, 2016.

See Also

abe, summary.abe, plot.abe

Examples

# simulate some data and fit a model

set.seed(1)
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- runif(n)
# x3 has no effect on the outcome
y <- -5 + 5 * x1 + 5 * x2 + rnorm(n, sd = 5)
dd <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3)
fit <- lm(y ~ x1 + x2 + x3, x = TRUE, y = TRUE, data = dd)

# use ABE on 50 bootstrap re-samples considering different
# change-in-estimate thresholds and significance levels

fit.boot <- abe.boot(fit, data = dd, include = "x1", active = "x2",
  tau = c(0.05, 0.1), exp.beta = FALSE, exact = TRUE,
  criterion = "alpha", alpha = c(0.2, 0.05), type.test = "Chisq",
  num.boot = 50, type.boot = "bootstrap")

summary(fit.boot)

# use ABE on 50 subsamples randomly selecting 50% of subjects
# considering different change-in-estimate thresholds and
# significance levels

fit.boot <- abe.boot(fit, data = dd, include = "x1", active = "x2",
  tau = c(0.05, 0.1), exp.beta = FALSE, exact = TRUE,
  criterion = "alpha", alpha = c(0.2, 0.05), type.test = "Chisq",
  num.boot = 50, type.boot = "subsampling", prop.sampling = 0.5)

summary(fit.boot)

[Package abe version 3.0.1 Index]