cv_partition {sparsediscrim} | R Documentation |
Randomly partitions data for cross-validation.
Description
For a vector of training labels, we return a list of cross-validation folds,
where each fold has the indices of the observations to leave out in the fold.
In terms of classification error rate estimation, one can think of a fold as
the observations to hold out as a test sample set. Either the hold_out size
or the number of folds, num_folds, can be specified. The number of folds
defaults to 10, but if the hold_out size is specified, then num_folds is
ignored.
Usage
cv_partition(y, num_folds = 10, hold_out = NULL, seed = NULL)
Arguments
y | a vector of class labels |
num_folds | the number of cross-validation folds. Ignored if hold_out is specified. |
hold_out | the hold-out size for cross-validation. See Details. |
seed | optional random number seed for splitting the data for cross-validation |
Details
We partition the vector y based on its length, which we treat as the
sample size, 'n'. If an object other than a vector is used in y, its
length can yield unexpected results. For example, the output of
length(diag(3)) is 9.
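The length pitfall described above can be checked with base R alone. The short sketch below does not call cv_partition; it only compares what length() reports for a matrix, a data frame, and a label vector, using the built-in iris data.

```r
# A matrix is a vector with a dim attribute, so length() counts every cell.
m <- diag(3)
length(m)        # 9, not the 3 rows

# A data frame is a list of columns, so length() counts the columns.
length(iris)     # 5
nrow(iris)       # 150

# Passing the label column itself gives the intended sample size.
y <- iris$Species
length(y)        # 150
```

Extracting the label vector first (for example, iris$Species rather than iris) keeps the sample size 'n' equal to the number of observations.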
Value
A list containing the indices of the training and test observations for each fold.
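To make the shape of such a result concrete, here is a minimal base R sketch of one way a partition like this could be built. It is not the package's implementation: the helper name sketch_cv_partition, the element names training and test, and the rule that a hold-out size h corresponds to ceiling(n / h) folds are all assumptions made for illustration.

```r
# Hypothetical helper for illustration only; not part of sparsediscrim.
sketch_cv_partition <- function(y, num_folds = 10, hold_out = NULL, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- length(y)

  # Assumption: a hold-out size of h corresponds to ceiling(n / h) folds,
  # in which case num_folds is ignored.
  if (!is.null(hold_out)) num_folds <- ceiling(n / hold_out)

  # Shuffle the indices, then deal them into folds of near-equal size.
  shuffled <- sample(seq_len(n))
  fold_id <- rep(seq_len(num_folds), length.out = n)
  test_sets <- split(shuffled, fold_id)

  # Each fold pairs its held-out indices with the remaining training indices.
  # The element names `training` and `test` are assumed, not documented here.
  lapply(test_sets, function(test) {
    list(training = setdiff(seq_len(n), test), test = test)
  })
}

folds <- sketch_cv_partition(iris$Species, hold_out = 15, seed = 42)
length(folds)                          # 10 folds
lengths(lapply(folds, `[[`, "test"))   # 15 held-out indices in each fold
```

In this sketch every observation index appears in exactly one test set, and each fold's training set is the complement of its test set.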
Examples
# The following three calls to `cv_partition` yield the same partitions.
set.seed(42)
cv_partition(iris$Species)
cv_partition(iris$Species, num_folds = 10, seed = 42)
cv_partition(iris$Species, hold_out = 15, seed = 42)