permutations {rsample} | R Documentation |
Permutation sampling
Description
A permutation sample is the same size as the original data set and is made
by permuting/shuffling one or more columns. This results in analysis
samples where some columns are in their original order and some columns
are permuted to a random order. Unlike other sampling functions in
rsample, there is no assessment set, and calling assessment() on a
permutation split will throw an error.
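The properties described above can be checked directly. The following is a minimal sketch, assuming the rsample package is installed; the object names (perm, split, shuffled, errored) are illustrative and not part of the package.

```r
library(rsample)

set.seed(1)  # make the shuffle reproducible
perm <- permutations(mtcars, mpg, times = 1)
split <- perm$splits[[1]]
shuffled <- analysis(split)

# Same number of rows as the original data
nrow(shuffled) == nrow(mtcars)

# The permuted column holds the same values, only reordered
identical(sort(shuffled$mpg), sort(mtcars$mpg))

# A column that was not selected keeps its original order
identical(shuffled$cyl, mtcars$cyl)

# There is no assessment set, so assessment() signals an error
errored <- tryCatch(
  {
    assessment(split)
    FALSE
  },
  error = function(e) TRUE
)
errored
```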
Usage
permutations(data, permute = NULL, times = 25, apparent = FALSE, ...)
Arguments
data | A data frame. |
permute | One or more columns to shuffle. This argument supports tidyselect selectors. |
times | The number of permutation samples. |
apparent | A logical. Should an extra resample be added where the analysis set is the original, unpermuted data set? |
... | These dots are for future extensions and must be empty. |
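As a rough illustration of how times and apparent shape the result, the sketch below compares the number of rows returned with and without the extra resample. It assumes rsample is installed; the names base and with_apparent are illustrative.

```r
library(rsample)

# One row per permutation sample
base <- permutations(mtcars, mpg, times = 5)
nrow(base)

# apparent = TRUE adds one further row for the original data
with_apparent <- permutations(mtcars, mpg, times = 5, apparent = TRUE)
nrow(with_apparent)
```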
Details
The argument apparent enables the option of an additional "resample"
where the analysis data set is the same as the original data set.
Permutation-based resampling can be especially helpful for computing
a statistic under the null hypothesis (e.g. a t-statistic). This forms the
basis of a permutation test, which computes a test statistic under all
possible permutations of the data. With times random permutations, the
resulting statistics approximate that null distribution.
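One way to turn this idea into an approximate permutation test is sketched below. It assumes rsample and purrr are installed. The helper t_stat, the choice of 200 permutations, and the add-one p-value formula are illustrative choices for this sketch, not something the package prescribes.

```r
library(rsample)
library(purrr)

set.seed(123)

# Hypothetical helper: t-statistic of hp between the two vs groups
t_stat <- function(dat) {
  unname(t.test(hp ~ vs, data = dat)$statistic)
}

# Statistic on the original, unpermuted data
observed <- t_stat(mtcars)

# Statistics under the null: hp is shuffled, so any link to vs is broken
resamples <- permutations(mtcars, hp, times = 200)
null_stats <- map_dbl(resamples$splits, function(x) t_stat(analysis(x)))

# Two-sided approximate p-value; the +1 terms count the observed statistic
# as one of the permutations so the estimate is never exactly zero
p_value <- (sum(abs(null_stats) >= abs(observed)) + 1) /
  (length(null_stats) + 1)
p_value
```

Increasing times tightens the approximation at the cost of more computation.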
Value
A tibble with classes permutations, rset, tbl_df, tbl, and data.frame.
The results include a column for the data split objects and a column
called id that has a character string with the resample identifier.
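The structure of the returned object can be inspected as in the sketch below, which assumes rsample is installed. The column name splits is taken from the way the Examples access the split objects; the object name resamples is illustrative.

```r
library(rsample)

resamples <- permutations(mtcars, mpg, times = 3)

# Class vector of the returned object
class(resamples)

# Column names: the split objects and the resample identifier
names(resamples)

# The identifiers are character strings, one per resample
resamples$id
```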
Examples
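library(rsample)
library(purrr)

# Shuffle mpg only; all other columns keep their original order
permutations(mtcars, mpg, times = 2)
# Add an extra "apparent" resample holding the original data
permutations(mtcars, mpg, times = 2, apparent = TRUE)

# Shuffle every column whose name starts with "c" and inspect the result
resample1 <- permutations(mtcars, starts_with("c"), times = 1)
resample1$splits[[1]] %>% analysis()

# t-statistic of hp by vs for each permuted sample and the original data
resample2 <- permutations(mtcars, hp, times = 10, apparent = TRUE)
map_dbl(resample2$splits, function(x) {
  t.test(hp ~ vs, data = analysis(x))$statistic
})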