resample-package {resample} | R Documentation |
Overview of the resample package
Description
Resampling functions, including one- and two-sample bootstrap and permutation tests, with an easy-to-use syntax.
Details
See library(help = resample) for version number, date, etc.
Data Sets
A list of datasets is at resample-data.
Main resampling functions
The main resampling functions are: bootstrap, bootstrap2, permutationTest, permutationTest2.
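For readers new to the idea, the following base-R sketch shows what a one-sample bootstrap computes: the statistic is recalculated on many samples drawn with replacement from the data, and the spread of those replicates estimates the standard error. It does not use the package; the data vector x and the names B, replicates, boot_se and boot_bias are hypothetical, chosen only for illustration.

```r
# Hypothetical data; any numeric vector works.
set.seed(1)
x <- rexp(30, rate = 1 / 10)

B <- 2000                       # number of bootstrap resamples
n <- length(x)

# Each replicate is the statistic computed on n draws with replacement.
replicates <- replicate(B, mean(x[sample.int(n, n, replace = TRUE)]))

observed  <- mean(x)
boot_se   <- sd(replicates)                # bootstrap standard error
boot_bias <- mean(replicates) - observed   # bootstrap bias estimate

c(observed = observed, se = boot_se, bias = boot_bias)
```

The package functions wrap this loop and add printing, plotting and interval methods; the sketch is only meant to show the underlying computation.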
Methods
Methods for generic functions include: print.resample, plot.resample, hist.resample, qqnorm.resample, and quantile.resample.
Confidence Intervals
Functions that calculate confidence intervals for bootstrap and bootstrap2 objects: CI.bca, CI.bootstrapT, CI.percentile, CI.t.
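As a rough illustration of the two simplest interval types, the base-R sketch below computes a plain percentile interval and a t interval with a bootstrap standard error from a vector of replicates. These are textbook versions under stated assumptions (hypothetical data x, n - 1 degrees of freedom, no small-sample adjustment), so the numbers need not match what the package functions return.

```r
set.seed(2)
x <- rexp(30, rate = 1 / 10)   # hypothetical skewed sample
n <- length(x)
replicates <- replicate(2000, mean(x[sample.int(n, n, replace = TRUE)]))

level <- 0.95
probs <- c((1 - level) / 2, (1 + level) / 2)

# Percentile interval: quantiles of the bootstrap replicates.
ci_percentile <- quantile(replicates, probs, names = FALSE)

# t interval: observed statistic +/- t quantile * bootstrap SE.
# Assumption: n - 1 degrees of freedom.
ci_t <- mean(x) + qt(probs, df = n - 1) * sd(replicates)

rbind(percentile = ci_percentile, t = ci_t)
```

The t interval is symmetric about the observed statistic by construction, while the percentile interval follows the shape of the bootstrap distribution.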
Samplers
Functions that generate indices for random samples: samp.bootstrap, samp.permute.
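To make "indices for random samples" concrete, here is a minimal base-R sketch of the two kinds of index sets. It does not call the package samplers; the layout (one column per resample) and the names boot_idx and perm_idx are assumptions made for this illustration.

```r
set.seed(3)
n <- 5   # sample size
B <- 4   # number of resamples

# Bootstrap indices: each column is n draws from 1..n with replacement.
boot_idx <- matrix(sample.int(n, n * B, replace = TRUE), nrow = n, ncol = B)

# Permutation indices: each column is a random reordering of 1..n.
perm_idx <- replicate(B, sample.int(n))

# Applying one column of indices to the data gives one resample.
x <- c(10, 20, 30, 40, 50)
x[boot_idx[, 1]]   # may repeat or omit observations
x[perm_idx[, 1]]   # same values, shuffled order
```

Working with indices rather than resampled data lets the same index sets be reused across statistics or data columns.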
Low-level Resampling Function
This is called by the main resampling functions, but can also be called directly: resample.
New Versions
I will post the newest versions to https://www.timhesterberg.net/r-packages. See that page to join a list for announcements of new versions.
Author(s)
Tim Hesterberg timhesterberg@gmail.com,
https://www.timhesterberg.net/bootstrap-and-resampling
Examples
data(Verizon)
ILEC <- with(Verizon, Time[Group == "ILEC"])
CLEC <- with(Verizon, Time[Group == "CLEC"])
#### Sections in this set of examples
### Different ways to specify the data and statistic
### Example with plots and confidence intervals.
### Different ways to specify the data and statistic
# This code is flexible; there are different ways to call it,
# depending on how the data are stored and on the statistic.
## One-sample Bootstrap
# Ordinary vector, give statistic as a function
bootstrap(CLEC, mean)
# Vector by name, give statistic as an expression
bootstrap(CLEC, mean(CLEC))
# Vector created by an expression, use the name 'data'
bootstrap(with(Verizon, Time[Group == "CLEC"]), mean(data))
# A column in a data frame; use the name of the column
temp <- data.frame(foo = CLEC)
bootstrap(temp, mean(foo))
# Put function arguments into an expression
bootstrap(CLEC, mean(CLEC, trim = .25))
# Put function arguments into a separate list
bootstrap(CLEC, mean, args.stat = list(trim = .25))
## One-sample jackknife
# Syntax is like bootstrap, e.g.
jackknife(CLEC, mean)
## One-sample permutation test
# To test H0: two variables are independent, exactly
# one of them must be permuted. For the CLEC data,
# we'll create an artificial variable.
CLEC2 <- data.frame(Time = CLEC, index = 1:length(CLEC))
permutationTest(CLEC2, cor(Time, index),
                resampleColumns = "index")
# Could permute "Time" instead.
# resampleColumns not needed for variables outside 'data'
permutationTest(CLEC, cor(CLEC, 1:length(CLEC)))
### Two-sample problems
## Different ways to specify data and statistic
## Two-sample bootstrap
# Two data objects (one for each group)
bootstrap2(CLEC, data2 = ILEC, mean)
# data frame containing y variable(s) and a treatment variable
bootstrap2(Verizon, mean(Time), treatment = Group)
# treatment variable as a separate object
temp <- Verizon$Group
bootstrap2(Verizon$Time, mean, treatment = temp)
## Two-sample permutation test
# Like bootstrap2, e.g.
permutationTest2(CLEC, data2 = ILEC, mean)
### Example with plots and confidence intervals.
boot <- bootstrap2(CLEC, data2 = ILEC, mean)
perm <- permutationTest2(CLEC, data2 = ILEC, mean,
                         alternative = "greater")
par(mfrow = c(2,2))
hist(boot)
qqnorm(boot)
qqline(boot$replicates)
hist(perm)
# P-value
perm
# Standard error, and bias estimate
boot
# Confidence intervals
CI.percentile(boot) # Percentile interval
CI.t(boot) # t interval using bootstrap SE
# CI.bootstrapT and CI.bca don't currently support two-sample problems.
# Statistic can be multivariate.
# For the bootstrap T, it must have the estimate first, and a standard
# error second (don't need to divide by sqrt(n), that cancels out).
bootC <- bootstrap(CLEC, mean, seed = 0)
bootC2 <- bootstrap(CLEC, c(mean = mean(CLEC), sd = sd(CLEC)), seed = 0)
identical(bootC$replicates[, 1], bootC2$replicates[, 1])
CI.percentile(bootC)
CI.t(bootC)
CI.bca(bootC)
CI.bootstrapT(bootC2)
# The bootstrapT is the most accurate for skewed data, especially
# for small samples.
# By default the percentile and BCa intervals are "expanded", for
# better coverage in small samples. To turn this off:
CI.percentile(bootC, expand = FALSE)