int_pctl {rsample} | R Documentation |
Bootstrap confidence intervals
Description
Calculate bootstrap confidence intervals using various methods.
Usage
int_pctl(.data, ...)
## S3 method for class 'bootstraps'
int_pctl(.data, statistics, alpha = 0.05, ...)
int_t(.data, ...)
## S3 method for class 'bootstraps'
int_t(.data, statistics, alpha = 0.05, ...)
int_bca(.data, ...)
## S3 method for class 'bootstraps'
int_bca(.data, statistics, alpha = 0.05, .fn, ...)
Arguments
.data |
A data frame containing the bootstrap resamples created using
|
... |
Arguments to pass to |
statistics |
An unquoted column name or |
alpha |
Level of significance. |
.fn |
A function to calculate statistic of interest. The
function should take an |
Details
Percentile intervals are the standard method of obtaining confidence intervals but require thousands of resamples to be accurate. T-intervals may need fewer resamples but require a corresponding variance estimate. Bias-corrected and accelerated intervals require the original function that was used to create the statistics of interest and are computationally taxing.
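To make the difference between the first two methods concrete, here is a minimal base R sketch of a percentile interval and a student-t interval for a sample mean. It is a conceptual illustration only, not rsample's internal code. The statistic (the mean of mtcars$mpg), the number of resamples, and all object names (boot, pctl_int, t_int, and so on) are assumptions chosen for the example.

```r
# Conceptual sketch in base R; object names here are illustrative only.
set.seed(2024)
x <- mtcars$mpg
n <- length(x)
alpha <- 0.05
times <- 2000

# Estimate and standard error from the original (apparent) data
theta_hat <- mean(x)
se_hat <- sd(x) / sqrt(n)

# For each resample, keep the estimate and its own standard error.
# The result is a 2 x times matrix with rows "estimate" and "std_err".
boot <- replicate(times, {
  xb <- sample(x, size = n, replace = TRUE)
  c(estimate = mean(xb), std_err = sd(xb) / sqrt(n))
})

# Percentile interval: quantiles of the bootstrap estimates
pctl_int <- quantile(
  boot["estimate", ],
  probs = c(alpha / 2, 1 - alpha / 2),
  names = FALSE
)

# Student-t interval: studentize each resample, then invert the quantiles.
# This is why a variance estimate is needed for every resample.
t_star <- (boot["estimate", ] - theta_hat) / boot["std_err", ]
t_q <- quantile(t_star, probs = c(1 - alpha / 2, alpha / 2), names = FALSE)
t_int <- theta_hat - t_q * se_hat

pctl_int
t_int
```

The percentile interval uses only the resampled estimates, while the student-t interval also needs the estimate and standard error from the original data, which is the role the apparent sample plays in a bootstrap object.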
Value
Each function returns a tibble with columns .lower, .estimate, .upper, .alpha, .method, and term. .method is the type of interval (e.g. "percentile", "student-t", or "BCa"). term is the name of the estimate. Note that the .estimate returned from int_pctl() is the mean of the estimates from the bootstrap resamples and not the estimate from the apparent model.
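The note about .estimate can be checked by putting the int_pctl() output next to the coefficients of a model fit to the original data. The sketch below reuses the lm_est() helper and seed from the Examples section. The names pctl, apparent, and comparison are assumptions made for illustration, and int_pctl() may warn that more resamples are recommended.

```r
library(rsample)
library(broom)
library(dplyr)
library(purrr)

# Same helper as in the Examples section: tidy coefficients per resample
lm_est <- function(split, ...) {
  lm(mpg ~ disp + hp, data = analysis(split)) %>%
    tidy()
}

set.seed(52156)
car_rs <-
  bootstraps(mtcars, 500, apparent = TRUE) %>%
  mutate(results = map(splits, lm_est))

pctl <- int_pctl(car_rs, results)

# Coefficients from the model fit to the original data set
apparent <- tidy(lm(mpg ~ disp + hp, data = mtcars))

# .estimate (mean over resamples) next to the original-data estimate
comparison <-
  pctl %>%
  select(term, .estimate) %>%
  inner_join(
    select(apparent, term, apparent_estimate = estimate),
    by = "term"
  )

comparison
```

The two estimate columns are usually close but not identical, since one averages over resamples and the other comes from a single fit.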
References
https://rsample.tidymodels.org/articles/Applications/Intervals.html
Davison, A., & Hinkley, D. (1997). Bootstrap Methods and their Application. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511802843
See Also
Examples
library(rsample)
library(broom)
library(dplyr)
library(purrr)
library(tibble)
lm_est <- function(split, ...) {
  lm(mpg ~ disp + hp, data = analysis(split)) %>%
    tidy()
}
set.seed(52156)
car_rs <-
  bootstraps(mtcars, 500, apparent = TRUE) %>%
  mutate(results = map(splits, lm_est))
int_pctl(car_rs, results)
int_t(car_rs, results)
int_bca(car_rs, results, .fn = lm_est)
# putting results into a tidy format
rank_corr <- function(split) {
  dat <- analysis(split)
  tibble(
    term = "corr",
    estimate = cor(dat$sqft, dat$price, method = "spearman"),
    # don't know the analytical std.err so no t-intervals
    std.err = NA_real_
  )
}
set.seed(69325)
data(Sacramento, package = "modeldata")
bootstraps(Sacramento, 1000, apparent = TRUE) %>%
  mutate(correlations = map(splits, rank_corr)) %>%
  int_pctl(correlations)
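The examples above all use the default alpha = 0.05. As a further sketch, the same resamples can be summarized at a different significance level by passing alpha explicitly. The helper mean_mpg() and the objects mpg_rs, pctl_95, and pctl_90 are hypothetical names introduced only for this illustration.

```r
library(rsample)
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical helper: one-row tidy tibble with the mean of mpg
mean_mpg <- function(split, ...) {
  dat <- analysis(split)
  tibble(term = "mean_mpg", estimate = mean(dat$mpg))
}

set.seed(8231)
mpg_rs <-
  bootstraps(mtcars, 1000) %>%
  mutate(stats = map(splits, mean_mpg))

# 95% interval (default) and 90% interval from the same resamples
pctl_95 <- int_pctl(mpg_rs, stats)
pctl_90 <- int_pctl(mpg_rs, stats, alpha = 0.10)

bind_rows(pctl_95, pctl_90)
```

Because both intervals are computed from the same bootstrap estimates, the 90% interval should be no wider than the 95% interval.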